Abstract

We study blind fingerprinting, where the host sequence into which fingerprints are embedded is 
partially or completely unknown to the decoder. This problem relates to a multiuser version of the 
Gel'fand-Pinsker problem. The number of colluders and the collusion channel are unknown, and the 
colluders and the fingerprint embedder are subject to distortion constraints. 

We propose a conditionally constant-composition random binning scheme and a universal decoding 
rule and derive the corresponding false-positive and false-negative error exponents. The encoder is a 
stacked binning scheme and makes use of an auxiliary random sequence. The decoder is a maximum 
doubly-penalized mutual information decoder, where the significance of each candidate coalition is 
assessed relative to a threshold that trades off false-positive and false-negative error exponents. The 
penalty is proportional to coalition size and is a function of the conditional type of host sequence. 
Positive exponents are obtained at all rates below a certain value, which is therefore a lower bound on
public fingerprinting capacity. We conjecture that this value is the public fingerprinting capacity. A simpler
threshold decoder is also given, which has similar universality properties but also lower achievable rates. 
An upper bound on public fingerprinting capacity is also derived. 

Index Terms. Fingerprinting, traitor tracing, watermarking, data hiding, randomized codes, universal 
codes, method of types, maximum mutual information decoder, minimum equivocation decoder, channel 
coding with side information, random binning, capacity, error exponents, multiple access channels, model 
order selection. 
I. Introduction

Content fingerprinting finds applications to document protection for multimedia distribution, broadcasting,
and traitor tracing [1]-[4]. A covertext (image, video, audio, or text) is to be distributed to
many users. A fingerprint, a mark unique to each user, is embedded into each copy of the covertext. In
a collusion attack, several users may combine their copies in an attempt to "remove" their fingerprints 
and to forge a pirated copy. The distortion between the pirated copy and the colluding copies is bounded 
by a certain tolerance level. To trace the forgery back to the coalition members, we need fingerprinting 
codes that can reliably identify the fingerprints of those members. Essentially, from a communication 
viewpoint, the fingerprinting problem is a multiuser version of the watermarking problem [5]-[10]. For 
watermarking, the attack is by one user and is based on one single copy, whereas for fingerprinting, the 
attack is modeled as a multiple-access channel (MAC). The covertext plays the role of side information 
to the encoder and possibly to the decoder. 

Depending on the availability of the original covertext to the decoder, there are two basic versions 
of the problem: private and public. In the private fingerprinting setup, the covertext is available to both 
the encoder and decoder. In the public fingerprinting setup, the covertext is available to the encoder but 
not to the decoder, and thus decoding performance is generally worse. However public fingerprinting 
presents an important advantage over private fingerprinting, in that it does not require the vast storage 
and computational resources that are needed for media registration in a large database. For example, a 
DVD player could detect fingerprints from a movie disc and refuse to play it if fingerprints other than 
the owner's are present. Or Web crawling programs can be used to automatically search for unauthorized 
content on the Internet or other public networks [3]. 

The scenario considered in this paper is one where a degraded version $S^d$ of each host symbol $S$ is
available to the decoder. Private and public fingerprinting are obtained as special cases with $S^d = S$
and $S^d = \emptyset$, respectively. We refer to this scenario as either blind or semiprivate fingerprinting. The
motivation is analogous to semiprivate watermarking [11], where some information about the host signal
is provided to the receiver in order to improve decoding performance. This may be necessary to guarantee
an acceptable performance level when the number of colluders is large.

The capacity and reliability limits of private fingerprinting have been studied in [7]-[10]. The decoder
of [10] is a variation of Liu and Hughes' minimum equivocation decoder [12], accounting for the presence 
of side information and for the fact that the number of channel inputs is unknown. Two basic types of 
decoders are of interest: detect-all and detect-one. The detect-all decoder aims to catch all members of 
the coalition and an error occurs if some colluder escapes detection. The detect-one decoder is content
with catching at least one of the culprits and an error occurs only when none of the colluders is identified.
A third type of error (arguably the most damaging one) is a false positive, by which the decoder accuses
an innocent user.

In the same way as fingerprinting is related to the MAC problem, blind fingerprinting is related to 
a multiuser extension of the Gel'fand-Pinsker problem. The capacity region for the latter problem is 
unknown. An inner region, achievable using random binning, was given in [13]. 

This paper derives random-coding exponents and an upper bound on detect-all capacity for semiprivate 
fingerprinting. Neither the encoder nor the decoder knows the number of colluders. The collusion channel
has arbitrary memory but is subject to a distortion constraint between the pirated copy and the colluding 
copies. Our fingerprinting scheme uses random binning because, unlike in the private setup, the availability 
of side information to the encoder and decoder is asymmetric. To optimize the error exponents, we propose 
an extension of the stacked-binning scheme that was developed for single-user channel coding with side 
information [11]. Here the codebook consists of a stack of variable-size codeword-arrays indexed by the 
conditional type of the covertext sequence. The decoder is a minimum doubly-penalized equivocation 
(M2PE) decoder or equivalently, a maximum doubly-penalized mutual information (M2PMI) decoder. 

The proposed fingerprinting system is universal in that it can cope with unknown collusion channels 
and unknown number of colluders, as in the private fingerprinting setup of [10]. A tunable parameter $\Delta$
trades off false-positive and false-negative error exponents. The derivation of these exponents combines
techniques from [10] and [11]. A preliminary version of our work, assuming a fixed number of colluders,
was given in [14], [15]. 

A. Organization of This Paper 

A mathematical statement of our generic fingerprinting problem is given in Sec. II together with the
basic definitions of error probabilities, capacity, error exponents, and fair coalitions. Sec. III presents
our random coding scheme. Sec. IV presents a simple but suboptimal decoder that compares empirical
mutual information scores between received data and individual fingerprints, and outputs a guilty decision
whenever the score exceeds a certain tunable threshold. Sec. V presents a joint decoder that assigns a
penalized empirical mutual information score to candidate coalitions and selects the coalition with the
highest score. Sec. VI establishes an upper bound on blind fingerprinting capacity under the detect-all
criterion. Finally, conclusions are given in Sec. VII. The proofs of the theorems are given in appendices.
B. Notation

We use uppercase letters for random variables, lowercase letters for their individual values, calligraphic
letters for finite alphabets, and boldface letters for sequences. We denote by $\mathcal{M}^*$ the set of sequences
of arbitrary length (including 0) whose elements are in $\mathcal{M}$. The probability mass function (p.m.f.) of a
random variable $X \in \mathcal{X}$ is denoted by $p_X = \{p_X(x), x \in \mathcal{X}\}$. The entropy of a random variable $X$ is
denoted by $H(X)$, and the mutual information between two random variables $X$ and $Y$ is denoted by
$I(X;Y) = H(X) - H(X|Y)$. Should the dependency on the underlying p.m.f.s be explicit, we write the
p.m.f.s as subscripts, e.g., $H_{p_X}(X)$ and $I_{p_X p_{Y|X}}(X;Y)$. The Kullback-Leibler divergence between two
p.m.f.s $p$ and $q$ is denoted by $D(p\|q)$; the conditional Kullback-Leibler divergence of $p_{Y|X}$ and $q_{Y|X}$
given $p_X$ is denoted by $D(p_{Y|X}\|q_{Y|X}|p_X) = D(p_{Y|X}\,p_X\|q_{Y|X}\,p_X)$. All logarithms are in base 2 unless
specified otherwise.

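Denote by $p_{\mathbf{x}}$ the type, or empirical p.m.f. induced by a sequence $\mathbf{x} \in \mathcal{X}^N$. The type class $T_{\mathbf{x}}$ is
the set of all sequences of type $p_{\mathbf{x}}$. Likewise, we denote by $p_{\mathbf{xy}}$ the joint type of a pair of sequences
$(\mathbf{x}, \mathbf{y}) \in \mathcal{X}^N \times \mathcal{Y}^N$ and by $T_{\mathbf{xy}}$ the type class associated with $p_{\mathbf{xy}}$. The conditional type $p_{\mathbf{y}|\mathbf{x}}$ of a pair
of sequences $(\mathbf{x}, \mathbf{y})$ is defined by $p_{\mathbf{xy}}(x,y)/p_{\mathbf{x}}(x)$ for all $x \in \mathcal{X}$ such that $p_{\mathbf{x}}(x) > 0$. The conditional
type class $T_{\mathbf{y}|\mathbf{x}}$ given $\mathbf{x}$ is the set of all sequences $\tilde{\mathbf{y}}$ such that $(\mathbf{x}, \tilde{\mathbf{y}}) \in T_{\mathbf{xy}}$. We denote by $H(\mathbf{x})$ the
empirical entropy of the p.m.f. $p_{\mathbf{x}}$, by $H(\mathbf{y}|\mathbf{x})$ the empirical conditional entropy, and by $I(\mathbf{x}; \mathbf{y})$ the
empirical mutual information for the joint p.m.f. $p_{\mathbf{xy}}$.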
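
The empirical quantities defined above can be computed directly from symbol counts. The following is a minimal Python sketch, not part of the original paper; the function names (`type_of`, `empirical_entropy`, `empirical_mutual_information`, `same_type_class`) are illustrative choices, and sequences are assumed to be equal-length lists or tuples of hashable symbols.

```python
from collections import Counter
from math import log2


def type_of(x):
    """Type (empirical p.m.f.) induced by the sequence x."""
    n = len(x)
    return {symbol: count / n for symbol, count in Counter(x).items()}


def empirical_entropy(x):
    """Empirical entropy H(x) in bits, i.e. the entropy of the type of x."""
    return -sum(p * log2(p) for p in type_of(x).values())


def empirical_mutual_information(x, y):
    """Empirical mutual information I(x; y) = H(x) + H(y) - H(x, y)."""
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")
    joint = list(zip(x, y))  # the sequence of pairs induces the joint type
    return empirical_entropy(x) + empirical_entropy(y) - empirical_entropy(joint)


def same_type_class(x1, x2):
    """True if x1 and x2 have the same type, i.e. lie in the same type class."""
    return Counter(x1) == Counter(x2)
```

For instance, `empirical_mutual_information(x, x)` equals `empirical_entropy(x)`, and two sequences that are permutations of each other fall in the same type class.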

We use the calligraphic fonts $\mathscr{P}_X$ and $\mathscr{P}_X^{[N]}$ to represent the set of all p.m.f.s and all empirical p.m.f.'s,
respectively, on the alphabet $\mathcal{X}$. Likewise, $\mathscr{P}_{Y|X}$ and $\mathscr{P}_{Y|X}^{[N]}$ denote the set of all conditional p.m.f.s and
all empirical conditional p.m.f.'s on the alphabet $\mathcal{Y}$. A special symbol $\mathscr{W}_K$ will be used to denote the
feasible set of collusion channels $p_{Y|X_1,\cdots,X_K}$ that can be selected by a size-$K$ coalition.

Mathematical expectation is denoted by the symbol $E$. The shorthands $a_N \doteq b_N$ and $a_N \,\dot{\le}\, b_N$ denote
asymptotic relations in the exponential scale, respectively $\lim_{N\to\infty} \frac{1}{N} \log \frac{a_N}{b_N} = 0$ and
$\limsup_{N\to\infty} \frac{1}{N} \log \frac{a_N}{b_N} \le 0$. We define $|t|^+ = \max(t, 0)$ and $\exp_2(t) = 2^t$. The indicator function of a set $\mathcal{A}$ is
denoted by $\mathbb{1}\{x \in \mathcal{A}\}$. Finally, we adopt the convention that the minimum of a function over an empty set
is $+\infty$ and the maximum of a function over an empty set is 0.

II. Statement of the Problem 

A. Overview 

Fig. 1. Model for semiprivate (blind) fingerprinting game, where $S^d$ is a degraded version of the covertext $S$. Private and
public fingerprinting arise as special cases with $S^d = S$ and $S^d = \emptyset$, respectively.

Our model for blind fingerprinting is diagrammed in Fig. 1. Let $\mathcal{S}$, $\mathcal{X}$, and $\mathcal{Y}$ be three finite alphabets.
The covertext sequence $\mathbf{S} = (S_1, \cdots, S_N) \in \mathcal{S}^N$ consists of $N$ independent and identically distributed
(i.i.d.) samples drawn from a p.m.f. $p_S(s)$, $s \in \mathcal{S}$. A random variable $V$ taking values in an alphabet $\mathcal{V}_N$ is
shared between encoder and decoder, and not publicly revealed. The random variable $V$ is independent of
$\mathbf{S}$ and plays the role of a cryptographic key. There are $2^{NR}$ users, each of which receives a fingerprinted
copy:

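$$\mathbf{X}_m = f_N(\mathbf{S}, V, m), \quad 1 \le m \le 2^{NR}, \qquad (2.1)$$

where $f_N : \mathcal{S}^N \times \mathcal{V}_N \times \{1, \cdots, 2^{NR}\} \to \mathcal{X}^N$ is the encoding function, and $m$ is the index of the
user. The encoder binds each fingerprinted copy $\mathbf{x}_m$ to the covertext $\mathbf{s}$ via a distortion constraint. Let
$d : \mathcal{S} \times \mathcal{X} \to \mathbb{R}^+$ be the distortion measure and $d^N(\mathbf{s},\mathbf{x}) = \frac{1}{N}\sum_{i=1}^N d(s_i, x_i)$ the extension of this
measure to length-$N$ sequences. The code $f_N$ is subject to the distortion constraint

$$d^N(\mathbf{s}, \mathbf{x}_m) \le D_1, \quad 1 \le m \le 2^{NR}. \qquad (2.2)$$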
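
As an illustration of the embedding constraint (2.2), the sketch below evaluates the per-sequence distortion as the average of a per-symbol distortion and checks it for every fingerprinted copy. It is a toy under stated assumptions: the Hamming measure is only an example choice of $d$, and the function names are hypothetical rather than taken from the paper.

```python
def hamming(s, x):
    """Example per-symbol distortion d(s, x): 0 if the symbols agree, 1 otherwise."""
    return 0.0 if s == x else 1.0


def sequence_distortion(s, x, d=hamming):
    """Average per-symbol distortion between a host sequence s and a copy x."""
    if len(s) != len(x):
        raise ValueError("sequences must have the same length")
    return sum(d(si, xi) for si, xi in zip(s, x)) / len(s)


def satisfies_embedding_constraint(s, copies, d1, d=hamming):
    """True if every fingerprinted copy is within distortion d1 of the host s."""
    return all(sequence_distortion(s, x, d) <= d1 for x in copies)
```

The same helper applies to any other bounded per-symbol distortion passed as `d`.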



Let $\mathcal{K} = \{m_1, m_2, \cdots, m_K\}$ be a coalition of $K$ users, called colluders. No constraints are imposed
on the formation of coalitions. The colluders combine their copies $\mathbf{X}_{\mathcal{K}} = \{\mathbf{X}_m, m \in \mathcal{K}\}$ to produce
a pirated copy $\mathbf{Y} \in \mathcal{Y}^N$. Without loss of generality, we assume that $\mathbf{Y}$ is generated stochastically as
the output of a collusion channel $p_{\mathbf{Y}|\mathbf{X}_{\mathcal{K}}}$. Fidelity constraints are imposed on $p_{\mathbf{Y}|\mathbf{X}_{\mathcal{K}}}$ to ensure that $\mathbf{Y}$
is "close" to the fingerprinted copies $\mathbf{X}_m$, $m \in \mathcal{K}$. These constraints can take the form of distortion
constraints, analogously to (2.2). They are formulated below and result in the definition of a feasible
class $\mathscr{W}_K$ of attacks.

The decoder knows neither $K$ nor the channel $p_{\mathbf{Y}|\mathbf{X}_{\mathcal{K}}}$ selected by the $K$ colluders and has access to the pirated copy
$\mathbf{Y}$, the secret key $V$, as well as to $\mathbf{S}^d$, a degraded version of the host $\mathbf{S}$. To simplify the exposition, the
degradation arises via a deterministic symbolwise mapping $h : \mathcal{S} \to \mathcal{S}^d$. The sequence $\mathbf{s}^d = h(\mathbf{s})$ could
represent a coarse version of $\mathbf{s}$, or some other features of $\mathbf{s}$. Two special cases are private fingerprinting
where $S^d = S$, and public fingerprinting where $S^d = \emptyset$. The decoder produces an estimate

$$\hat{\mathcal{K}} = g_N(\mathbf{Y}, \mathbf{S}^d, V) \qquad (2.3)$$

of the coalition. A possible decision is the empty set, $\hat{\mathcal{K}} = \emptyset$, which is the reasonable choice when an
accusation would be unreliable. To summarize, we have

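Definition 2.1: A randomized rate-$R$ length-$N$ fingerprinting code $(f_N, g_N)$ with embedding distortion
$D_1$ is a pair of encoder mapping $f_N : \mathcal{S}^N \times \mathcal{V}_N \times \{1, 2, \cdots, 2^{NR}\} \to \mathcal{X}^N$ and decoder mapping
$g_N : \mathcal{Y}^N \times (\mathcal{S}^d)^N \times \mathcal{V}_N \to \{1, 2, \cdots, 2^{NR}\}^*$.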

The randomization is via the secret key $V$ and can take the form of permutations of the symbol
positions $\{1, 2, \cdots, N\}$, permutations of the $2^{NR}$ fingerprint assignments, and an auxiliary time-sharing
sequence, as in [6]-[10], [16].

We now state the attack models and define the error probabilities, capacities, and error exponents.

B. Collusion Channels 

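The conditional type $p_{\mathbf{y}|\mathbf{x}_{\mathcal{K}}}$ is a random variable whose conditional distribution given $\mathbf{x}_{\mathcal{K}}$ depends on
the collusion channel $p_{\mathbf{Y}|\mathbf{X}_{\mathcal{K}}}$. Our fidelity constraint on the coalition is of the general form

$$Pr[p_{\mathbf{y}|\mathbf{x}_{\mathcal{K}}} \in \mathscr{W}_K] = 1, \qquad (2.4)$$

where $\mathscr{W}_K$ is a convex subset of $\mathscr{P}_{Y|X_{\mathcal{K}}}$. That is, the empirical conditional p.m.f. of the pirated copy given
the marked copies is restricted. Examples of $\mathscr{W}_K$ are given in [10], including hard distortion constraints
on the coalition:

$$\mathscr{W}_K = \Big\{ p_{Y|X_{\mathcal{K}}} : \sum_{x_{\mathcal{K}}, y} p_{X_{\mathcal{K}}}(x_{\mathcal{K}})\, p_{Y|X_{\mathcal{K}}}(y|x_{\mathcal{K}})\, d_2(\phi(x_{\mathcal{K}}), y) \le D_2 \Big\} \qquad (2.5)$$

where $\phi : \mathcal{X}^K \to \mathcal{S}$ is a (possibly randomized) permutation-invariant estimator $\hat{S} = \phi(X_{\mathcal{K}})$ of each host
signal sample based on the corresponding marked samples; $d_2 : \mathcal{S} \times \mathcal{Y} \to \mathbb{R}^+$ is the coalition's distortion
function; $p_{X_{\mathcal{K}}}$ is a reference p.m.f.; and $D_2$ is the maximum allowed distortion. Another possible choice
for $\mathscr{W}_K$ is obtained using the Boneh-Shaw constraint [1], [10].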

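Fair Coalitions. Denote by $\pi$ a permutation of the elements of $\mathcal{K}$. The set of fair, feasible collusion
channels is the subset of $\mathscr{W}_K$ consisting of permutation-invariant channels:

$$\mathscr{W}_K^{fair} = \{ p_{Y|X_{\mathcal{K}}} \in \mathscr{W}_K : p_{Y|X_{\pi(\mathcal{K})}} = p_{Y|X_{\mathcal{K}}}, \ \forall \pi \}. \qquad (2.6)$$

The collusion channel $p_{\mathbf{Y}|\mathbf{X}_{\mathcal{K}}}$ is said to be fair if $Pr[p_{\mathbf{y}|\mathbf{x}_{\mathcal{K}}} \in \mathscr{W}_K^{fair}] = 1$. For any fair collusion channel,
the conditional type $p_{\mathbf{y}|\mathbf{x}_{\mathcal{K}}}$ is invariant to permutations of the colluders.

Strongly exchangeable collusion channels [7]. Now denote by $\pi$ a permutation of the samples of a
length-$N$ sequence. For strongly exchangeable channels, $p_{\mathbf{Y}|\mathbf{X}_{\mathcal{K}}}(\pi\mathbf{y}|\pi\mathbf{x}_{\mathcal{K}})$ is independent of $\pi$, for every
$(\mathbf{x}_{\mathcal{K}}, \mathbf{y})$. The channel is defined by a probability assignment $Pr[T_{\mathbf{y}|\mathbf{x}_{\mathcal{K}}}]$ on the conditional type classes.
The distribution of $\mathbf{Y}$ conditioned on $\mathbf{Y} \in T_{\mathbf{y}|\mathbf{x}_{\mathcal{K}}}$ is uniform:

$$p_{\mathbf{Y}|\mathbf{X}_{\mathcal{K}}}(\tilde{\mathbf{y}}|\mathbf{x}_{\mathcal{K}}) = \frac{Pr[T_{\mathbf{y}|\mathbf{x}_{\mathcal{K}}}]}{|T_{\mathbf{y}|\mathbf{x}_{\mathcal{K}}}|}, \quad \forall \tilde{\mathbf{y}} \in T_{\mathbf{y}|\mathbf{x}_{\mathcal{K}}}. \qquad (2.7)$$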

C. Error Probabilities 

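Let $\mathcal{K}$ be the actual coalition and $\hat{\mathcal{K}} = g_N(\mathbf{Y}, \mathbf{S}^d, V)$ the decoder's output. The three error probabilities
of interest in this paper are the probability of false positives (one or more innocent users are accused),

$$P_{FP}(f_N, g_N, p_{\mathbf{Y}|\mathbf{X}_{\mathcal{K}}}) = Pr[\hat{\mathcal{K}} \setminus \mathcal{K} \ne \emptyset],$$

the probability of failing to catch even a single colluder,

$$P_e^{one}(f_N, g_N, p_{\mathbf{Y}|\mathbf{X}_{\mathcal{K}}}) = Pr[\hat{\mathcal{K}} \cap \mathcal{K} = \emptyset],$$

and the probability of failing to catch the full coalition:

$$P_e^{all}(f_N, g_N, p_{\mathbf{Y}|\mathbf{X}_{\mathcal{K}}}) = Pr[\mathcal{K} \not\subseteq \hat{\mathcal{K}}].$$

These three probabilities are obtained by averaging over $\mathbf{S}$, $V$, and the output of the collusion channel
$p_{\mathbf{Y}|\mathbf{X}_{\mathcal{K}}}$. In each case the worst-case probability is denoted by

$$P_e(f_N, g_N, \mathscr{W}_K) = \max_{p_{\mathbf{Y}|\mathbf{X}_{\mathcal{K}}}} P_e(f_N, g_N, p_{\mathbf{Y}|\mathbf{X}_{\mathcal{K}}}) \qquad (2.8)$$

where $P_e$ denotes either $P_{FP}$, $P_e^{one}$ or $P_e^{all}$, and the maximum is over all feasible collusion channels,
i.e., such that (2.4) holds.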
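
The three error events can be expressed as elementary set operations on the true coalition and the decoder's output. The sketch below is illustrative only; the function name `error_events` and the dictionary keys are hypothetical, and the true coalition is assumed to be nonempty.

```python
def error_events(true_coalition, decoded_coalition):
    """Classify one decoding outcome against the three error criteria.

    true_coalition    -- iterable of colluder indices (assumed nonempty)
    decoded_coalition -- iterable of accused user indices (may be empty)
    """
    k = frozenset(true_coalition)
    k_hat = frozenset(decoded_coalition)
    return {
        # at least one innocent user is accused
        "false_positive": bool(k_hat - k),
        # detect-one failure: no colluder at all is caught
        "miss_one": not (k_hat & k),
        # detect-all failure: at least one colluder escapes
        "miss_all": not k <= k_hat,
    }
```

For a nonempty coalition, a detect-one failure always implies a detect-all failure, which is the set-level reason the detect-all criterion is the more demanding one.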



D. Capacity and Random-Coding Exponents 

Definition 2.2: A rate $R$ is achievable for embedding distortion $D_1$, collusion class $\mathscr{W}_K$, and detect-one
criterion if there exists a sequence of $(N, \lceil 2^{NR} \rceil)$ randomized codes $(f_N, g_N)$ with maximum embedding
distortion $D_1$, such that both $P_e^{one}(f_N, g_N, \mathscr{W}_K)$ and $P_{FP}(f_N, g_N, \mathscr{W}_K)$ vanish as $N \to \infty$.

Definition 2.3: A rate $R$ is achievable for embedding distortion $D_1$, collusion class $\mathscr{W}_K$, and detect-all
criterion if there exists a sequence of $(N, \lceil 2^{NR} \rceil)$ randomized codes $(f_N, g_N)$ with maximum embedding
distortion $D_1$, such that both $P_e^{all}(f_N, g_N, \mathscr{W}_K)$ and $P_{FP}(f_N, g_N, \mathscr{W}_K)$ vanish as $N \to \infty$.

Definition 2.4: Fingerprinting capacities $C^{one}(D_1, \mathscr{W}_K)$ and $C^{all}(D_1, \mathscr{W}_K)$ are the suprema of all
achievable rates with respect to the detect-one and detect-all criteria, respectively.

For random codes the error exponents corresponding to (2.8) are defined as

$$E^{\{one, all, FP\}}(R, D_1, \mathscr{W}_K) = \liminf_{N \to \infty} \left[ -\frac{1}{N} \log P_e^{\{one, all, FP\}}(f_N, g_N, \mathscr{W}_K) \right]. \qquad (2.9)$$

We have $C^{all}(D_1, \mathscr{W}_K) \le C^{one}(D_1, \mathscr{W}_K)$ and $E^{all}(R, D_1, \mathscr{W}_K) \le E^{one}(R, D_1, \mathscr{W}_K)$ because an error
event for the detect-one problem is also an error event for the detect-all problem.

III. Overview of Random-Coding Scheme 

A brief overview of our scheme is given in this section. The decoders will be specified later. The
scheme is designed to achieve a false-positive error exponent equal to $\Delta$ and assumes a nominal value
$K_{nom}$ for coalition size. Two arbitrarily large integers $L_w$ and $L_u$ are selected, defining alphabets
$\mathcal{W} = \{1, 2, \cdots, L_w\}$ and $\mathcal{U} = \{1, 2, \cdots, L_u\}$, respectively. The parameters $\Delta, K_{nom}, L_w, L_u$ are used to
identify a certain optimal type class $T^*_{\mathbf{w}}$ and conditional type classes $T^*_{U|S^dW}(\mathbf{s}^d, \mathbf{w})$, $T^*_{U|SW}(\mathbf{s}, \mathbf{w})$ and
$T^*_{X|USW}(\mathbf{u}, \mathbf{s}, \mathbf{w})$ for every possible $(\mathbf{u}, \mathbf{s}, \mathbf{w})$. Optimality is defined relative to either the thresholding
decoder of Sec. IV or the joint decoder of Sec. V. The secret key $V$ consists of a random sequence
$\mathbf{W} \in T^*_{\mathbf{w}}$ and the collection (3.1) of random codebooks indexed by $\mathbf{s}^d$, $\mathbf{w}$, $\lambda$.



A. Codebook 

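A random constant-composition code

$$\mathcal{C}(\mathbf{s}^d, \mathbf{w}, \lambda) = \{ \mathbf{u}(l, m, \lambda), \ 1 \le l \le 2^{N\rho(\lambda)}, \ 1 \le m \le 2^{NR} \} \qquad (3.1)$$

is generated for each pair of sequences $(\mathbf{s}^d, \mathbf{w}) \in (\mathcal{S}^d)^N \times \mathcal{W}^N$ and conditional type $\lambda \in \mathscr{P}^{[N]}_{S|S^dW}$ by
drawing $2^{N[R+\rho(\lambda)]}$ random sequences independently and uniformly from an optimized conditional type
class $T^*_{U|S^dW}(\mathbf{s}^d, \mathbf{w})$, and arranging them into an array with $2^{NR}$ columns and $2^{N\rho(\lambda)}$ rows. Similarly to
[11] (see Fig. 2 therein), we refer to $\rho(\lambda)$ as the depth parameter of the array.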

B. Encoding Scheme 

Prior to encoding, a sequence $\mathbf{W} \in \mathcal{W}^N$ is drawn independently of $\mathbf{S}$ and uniformly from $T^*_{\mathbf{w}}$, and
shared with the receiver. Given $(\mathbf{S}, \mathbf{W})$, the encoder determines the conditional type $\lambda = p_{\mathbf{s}|\mathbf{s}^d\mathbf{w}}$ and
performs the following two steps for each user $1 \le m \le 2^{NR}$.

1) Find $l$ such that $\mathbf{u}(l, m, \lambda) \in \mathcal{C}(\mathbf{s}^d, \mathbf{w}, \lambda) \cap T^*_{U|SW}(\mathbf{s}, \mathbf{w})$. If more than one such $l$ exists, pick one
of them randomly (with uniform distribution). Let $\mathbf{u} = \mathbf{u}(l, m, \lambda)$. If no such $l$ can be found,
generate $\mathbf{u}$ uniformly from the conditional type class $T^*_{U|SW}(\mathbf{s}, \mathbf{w})$.

2) Generate $\mathbf{X}_m$ uniformly distributed over the conditional type class $T^*_{X|USW}(\mathbf{u}, \mathbf{s}, \mathbf{w})$, and assign
this marked sequence to user $m$.
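
Step 1 of the encoder is a search down one column of the codeword array. The toy sketch below illustrates the idea under simplifying assumptions that are not in the paper: the time-sharing sequence $\mathbf{w}$ is omitted, membership in the target conditional type class is tested by comparing the joint type of $(\mathbf{u}, \mathbf{s})$ against a given target, and the fallback sampler is supplied by the caller. The names `joint_type` and `select_codeword` are hypothetical.

```python
import random
from collections import Counter


def joint_type(*seqs):
    """Joint symbol counts of equal-length sequences (an unnormalized joint type)."""
    return Counter(zip(*seqs))


def select_codeword(codebook, m, s, target_joint_type, fallback, rng=random):
    """Pick a codeword in column m whose joint type with s matches the target.

    codebook -- list of rows; codebook[l][m] plays the role of u(l, m, lambda)
    fallback -- callable s -> u, used when no row of column m qualifies
    Returns (u, found), where found is False if an encoding error occurred.
    """
    matches = [row[m] for row in codebook
               if joint_type(row[m], s) == target_joint_type]
    if matches:
        # several qualifying rows: choose uniformly among them
        return rng.choice(matches), True
    return fallback(s), False
```

The boolean flag makes the encoding-error event explicit, which is the event whose probability the depth parameter of the array is chosen to control.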

C. Worst Collusion Channel 

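The fingerprinting codes used in this paper are randomly-modulated (RM) codes [10, Def. 2.2]. For
such codes we have the following proposition, which is a straightforward variation of [10, Prop. 2.1]
with $\mathbf{S}^d$ in place of $\mathbf{S}$ at the decoder.

Proposition 3.1: For any RM code $(f_N, g_N)$, the maximum of the error probability criteria (2.8) over
all feasible $p_{\mathbf{Y}|\mathbf{X}_{\mathcal{K}}}$ is achieved by a strongly exchangeable collusion channel, as defined in (2.7).
To derive error exponents for such channels, it suffices to use the following upper bound:

$$p_{\mathbf{Y}|\mathbf{X}_{\mathcal{K}}}(\tilde{\mathbf{y}}|\mathbf{x}_{\mathcal{K}}) = \frac{Pr[T_{\mathbf{y}|\mathbf{x}_{\mathcal{K}}}]}{|T_{\mathbf{y}|\mathbf{x}_{\mathcal{K}}}|} \le \frac{1}{|T_{\mathbf{y}|\mathbf{x}_{\mathcal{K}}}|}\, \mathbb{1}\{p_{\mathbf{y}|\mathbf{x}_{\mathcal{K}}} \in \mathscr{W}_K\}, \quad \forall \tilde{\mathbf{y}} \in T_{\mathbf{y}|\mathbf{x}_{\mathcal{K}}} \qquad (3.2)$$

which holds uniformly over all feasible probability assignments to conditional type classes $T_{\mathbf{y}|\mathbf{x}_{\mathcal{K}}}$.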

D. Encoding and Decoding Errors 

The array depth parameter $\rho(\lambda)$ takes the form

$$\rho(\lambda) = I(\mathbf{u}; \mathbf{s}|\mathbf{s}^d, \mathbf{w}) + \epsilon$$

where $\mathbf{u}$ is any element of $T^*_{U|SW}(\mathbf{s}, \mathbf{w})$, and $\epsilon > 0$ is an arbitrarily small number. The analysis shows
that given any $(\mathbf{s}, \mathbf{w})$, the probability of encoding errors vanishes doubly exponentially.

The analysis also shows that the decoding error probability is dominated by a single joint type class
$T^*_{\mathbf{yusw}}$. Denote by $(\mathbf{y}, \mathbf{u}, \mathbf{s}, \mathbf{w})$ an arbitrary representative of that class. The normalized logarithm of the
size of the array is given by

$$R + \rho(\lambda) = I(\mathbf{u}; \mathbf{y}|\mathbf{s}^d \mathbf{w}) - \Delta,$$

and the probability of false positives vanishes as $2^{-N\Delta}$.



Mai-ch 4, 2008 



DRAFT 



10 



IV. Threshold Decoder 

A. Decoding 

The decoder has access to $(\mathbf{y}, \mathbf{s}^d, \mathbf{w})$ but does not know the conditional type $\lambda = p_{\mathbf{s}|\mathbf{s}^d\mathbf{w}}$ realized at the
encoder. The decoder evaluates the users one at a time and makes an innocent/guilty decision on each
user independently of the other users. Specifically, the receiver outputs an estimated coalition $\hat{\mathcal{K}}$ if and
only if $\hat{\mathcal{K}}$ satisfies the following condition:

$$\forall m \in \hat{\mathcal{K}} : \ \max_{\lambda \in \mathscr{P}^{[N]}_{S|S^dW}} \ \max_{1 \le l \le 2^{N\rho(\lambda)}} \left[ I(\mathbf{u}(l, m, \lambda); \mathbf{y}|\mathbf{s}^d \mathbf{w}) - \rho(\lambda) \right] > R + \Delta. \qquad (4.1)$$

If no such $\hat{\mathcal{K}}$ is found, the receiver outputs $\hat{\mathcal{K}} = \emptyset$. This decoder outputs all user indices whose empirical
mutual information score, penalized by $\rho(\lambda)$, exceeds the threshold $R + \Delta$.

Observe that the maximizing $\lambda$ in (4.1) may depend on $m$. With high probability, this event implies a
decoding error. Improvements can only be obtained using a more complex joint decoder, as in Sec. V.
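
The threshold rule can be illustrated with a brute-force toy decoder. This is a sketch under stated assumptions, not the paper's implementation: the codebooks are tiny explicit arrays, the side information $(\mathbf{s}^d, \mathbf{w})$ is passed as a single sequence `z` of per-position symbols, and the depth penalty for each conditional type is supplied as a number `rho`. The names `joint_entropy`, `cond_mi` and `threshold_decode` are hypothetical.

```python
from collections import Counter
from math import log2


def joint_entropy(*seqs):
    """Empirical joint entropy (bits) of equal-length sequences."""
    n = len(seqs[0])
    return -sum(c / n * log2(c / n) for c in Counter(zip(*seqs)).values())


def cond_mi(u, y, z):
    """Empirical conditional mutual information
    I(u; y | z) = H(u, z) + H(y, z) - H(u, y, z) - H(z)."""
    return (joint_entropy(u, z) + joint_entropy(y, z)
            - joint_entropy(u, y, z) - joint_entropy(z))


def threshold_decode(codebooks, y, z, rate, delta):
    """Accuse every user whose best penalized score exceeds rate + delta.

    codebooks -- dict: conditional-type label -> (rho, array),
                 where array[l][m] plays the role of u(l, m, lambda)
    """
    accused = set()
    for rho, array in codebooks.values():
        for row in array:
            for m, u in enumerate(row):
                # the max over (lambda, l) exceeds the threshold
                # iff some single (lambda, l) does
                if cond_mi(u, y, z) - rho > rate + delta:
                    accused.add(m)
    return accused
```

Each user is tested in isolation, which is why this decoder is simple but cannot exploit the joint statistics of several colluders' codewords.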

B. Error Exponents 

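Define the following set of conditional p.m.f.'s for $(XU)_{\mathcal{K}} \triangleq (X_{\mathcal{K}}, U_{\mathcal{K}})$ given $(S, W)$:

$$\mathscr{M}(p_{XU|SW}) = \{ p_{(XU)_{\mathcal{K}}|SW} : p_{X_m U_m|SW} = p_{XU|SW}, \ m \in \mathcal{K} \},$$

i.e., the conditional marginal p.m.f. $p_{XU|SW}$ is the same for each $(X_m, U_m)$, $\forall m \in \mathcal{K}$. Also define the
sets

$$\mathscr{P}_{XU|SW}(p_{SW}, L_w, L_u, D_1) = \{ p_{XU|SW} : E[d(S, X)] \le D_1 \},$$

$$\mathscr{P}_{(XU)_{\mathcal{K}}W|S}(p_S, L_w, L_u, D_1) = \Big\{ p_{(XU)_{\mathcal{K}}W|S} = p_W \prod_{k \in \mathcal{K}} p_{X_k U_k|SW} : \ p_{X_1 U_1|SW} = \cdots = p_{X_K U_K|SW}, \text{ and } E[d(S, X_1)] \le D_1 \Big\} \qquad (4.2)$$

where in (4.2) the random variables $(X_k, U_k)$, $k \in \mathcal{K}$, are conditionally i.i.d. given $(S, W)$.
Define for each $m \in \mathcal{K}$ the set of conditional p.m.f.'s

$$\mathscr{P}_{Y(XU)_{\mathcal{K}}|SW}(p_W, p_{S|W}, p_{XU|SW}, \mathscr{W}_K, R, L_w, L_u, m) = \Big\{ p_{Y(XU)_{\mathcal{K}}|SW} : \ p_{(XU)_{\mathcal{K}}|SW} \in \mathscr{M}(p_{XU|SW}), \ p_{Y|X_{\mathcal{K}}} \in \mathscr{W}_K,$$
$$I_{p_W p_{S|W} p_{Y(XU)_{\mathcal{K}}|SW}}(U_m; Y|S^d W) - I_{p_W p_{S|W} p_{XU|SW}}(U_m; S|S^d W) \le R \Big\} \qquad (4.3)$$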
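and the pseudo sphere packing exponent

$$E_{psp,m}(R, L_w, L_u, p_W, p_{S|W}, p_{XU|SW}, \mathscr{W}_K) = \min_{p_{Y(XU)_{\mathcal{K}}|SW} \in \mathscr{P}_{Y(XU)_{\mathcal{K}}|SW}(p_W, p_{S|W}, p_{XU|SW}, \mathscr{W}_K, R, L_w, L_u, m)} D(p_{Y(XU)_{\mathcal{K}}|SW}\, p_{S|W} \,\|\, p_{Y|X_{\mathcal{K}}}\, p_{XU|SW}\, p_S \,|\, p_W). \qquad (4.4)$$

Taking the maximum and minimum of $E_{psp,m}$ above over $m \in \mathcal{K}$, we respectively define

$$\overline{E}_{psp}(R, L_w, L_u, p_W, p_{S|W}, p_{XU|SW}, \mathscr{W}_K) = \max_{m \in \mathcal{K}} E_{psp,m}(R, L_w, L_u, p_W, p_{S|W}, p_{XU|SW}, \mathscr{W}_K), \qquad (4.5)$$

$$\underline{E}_{psp}(R, L_w, L_u, p_W, p_{S|W}, p_{XU|SW}, \mathscr{W}_K) = \min_{m \in \mathcal{K}} E_{psp,m}(R, L_w, L_u, p_W, p_{S|W}, p_{XU|SW}, \mathscr{W}_K). \qquad (4.6)$$

For a fair coalition ($\mathscr{W}_K = \mathscr{W}_K^{fair}$), $E_{psp,m}$ is independent of $m \in \mathcal{K}$, and the two expressions above
coincide. Define

$$E_{psp}(R, L_w, L_u, D_1, \mathscr{W}_K) = \max_{p_W \in \mathscr{P}_W} \ \min_{p_{S|W} \in \mathscr{P}_{S|W}} \ \max_{p_{XU|SW} \in \mathscr{P}_{XU|SW}(p_W p_{S|W}, L_w, L_u, D_1)} E_{psp,1}(R, L_w, L_u, p_W, p_{S|W}, p_{XU|SW}, \mathscr{W}_K^{fair}). \qquad (4.7)$$

Denote by $p_W^*$ and $p^*_{XU|SW}$ the maximizers in (4.7), the latter to be viewed as a function of $p_{S|W}$. Both
$p_W^*$ and $p^*_{XU|SW}$ implicitly depend on $R$ and $\mathscr{W}_K^{fair}$. Finally, define

$$\overline{E}_{psp}(R, L_w, L_u, D_1, \mathscr{W}_K) = \min_{p_{S|W} \in \mathscr{P}_{S|W}} \overline{E}_{psp}(R, L_w, L_u, p_W^*, p_{S|W}, p^*_{XU|SW}, \mathscr{W}_K), \qquad (4.8)$$

$$\underline{E}_{psp}(R, L_w, L_u, D_1, \mathscr{W}_K) = \min_{p_{S|W} \in \mathscr{P}_{S|W}} \underline{E}_{psp}(R, L_w, L_u, p_W^*, p_{S|W}, p^*_{XU|SW}, \mathscr{W}_K). \qquad (4.9)$$

The terminology pseudo sphere-packing exponent is used because despite its superficial similarity to
a real sphere-packing exponent, (4.4) does not provide a fundamental asymptotic lower bound on error
probability.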

Theorem 4.1: The decision rule (4.1) yields the following error exponents.

(i) The false-positive error exponent is

$$E_{FP}(R, D_1, \mathscr{W}_K, \Delta) = \Delta. \qquad (4.10)$$

(ii) The detect-all error exponent is

$$E^{all}(R, L_w, L_u, D_1, \mathscr{W}_K, \Delta) = \underline{E}_{psp}(R + \Delta, L_w, L_u, D_1, \mathscr{W}_K). \qquad (4.11)$$

(iii) The detect-one error exponent is

$$E^{one}(R, L_w, L_u, D_1, \mathscr{W}_K, \Delta) = \overline{E}_{psp}(R + \Delta, L_w, L_u, D_1, \mathscr{W}_K). \qquad (4.12)$$

(iv) A fair collusion strategy is optimal under the detect-one error criterion:
$E^{one}(R, L_w, L_u, D_1, \mathscr{W}_K, \Delta) = E^{one}(R, L_w, L_u, D_1, \mathscr{W}_K^{fair}, \Delta)$.

(v) The detect-one and detect-all error exponents are the same when the colluders employ a fair
strategy: $E^{one}(R, L_w, L_u, D_1, \mathscr{W}_K^{fair}, \Delta) = E^{all}(R, L_w, L_u, D_1, \mathscr{W}_K^{fair}, \Delta)$.

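(vi) For $K = K_{nom}$, the supremum of all rates for which the detect-one error exponent of (4.12) is
positive is

$$\lim_{L_w, L_u \to \infty} \ \max_{p_W \in \mathscr{P}_W} \ \max_{p_{XU|SW} \in \mathscr{P}_{XU|SW}(p_W p_S, L_w, L_u, D_1)} \ \min_{p_{Y|X_{\mathcal{K}}} \in \mathscr{W}_K^{fair}} \left[ I(U; Y|S^d, W) - I(U; S|S^d, W) \right]. \qquad (4.13)$$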

V. Joint Fingerprint Decoder 

The fundamental improvement over the simple thresholding strategy for decoding in Sec. IV resides
in the use of a joint decoding rule. Specifically, the decoder maximizes a penalized empirical mutual
information score over all possible coalitions of any size. The penalty depends on the conditional host
sequence type $p_{\mathbf{s}|\mathbf{s}^d\mathbf{w}}$, as in Sec. IV, and is proportional to the size of the coalition, as in [10, Sec. V]. We
call this blind fingerprint decoder the maximum doubly-penalized mutual information (M2PMI) decoder.

Mutual Information of $k$ Random Variables. The mutual information of $k$ random variables $X_1, \cdots, X_k$
is defined as the sum of their individual entropies minus their joint entropy [21, p. 57] or equivalently,
the divergence between their joint distribution and the product of their marginals:

$$\mathring{I}(X_1; \cdots; X_k) = H(X_1) + \cdots + H(X_k) - H(X_1, \cdots, X_k) = D(p_{X_1 \cdots X_k} \| p_{X_1} \cdots p_{X_k}). \qquad (5.1)$$

The symbol $\mathring{I}$ is used to distinguish it from ordinary mutual information $I$ between two random variables.
Similarly one can define a conditional mutual information $\mathring{I}(X_1; \cdots; X_k|Z) = \sum_i H(X_i|Z) - H(X_1, \cdots, X_k|Z)$
conditioned on $Z$, and an empirical mutual information $\mathring{I}(\mathbf{x}_1; \cdots; \mathbf{x}_k|\mathbf{z})$ between $k$
sequences $\mathbf{x}_1, \cdots, \mathbf{x}_k$, conditioned on $\mathbf{z}$, as the conditional mutual information with respect to the joint
type of $\mathbf{x}_1, \cdots, \mathbf{x}_k, \mathbf{z}$. Some properties of $\mathring{I}$ are given in [10, Sec. V.A].
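
The empirical version of the $k$-variable mutual information follows directly from joint symbol counts. Below is a minimal sketch, assuming equal-length sequences of hashable symbols; the names `joint_entropy` and `multi_information` are illustrative and not from the paper.

```python
from collections import Counter
from math import log2


def joint_entropy(*seqs):
    """Empirical joint entropy (bits) of equal-length sequences."""
    n = len(seqs[0])
    return -sum(c / n * log2(c / n) for c in Counter(zip(*seqs)).values())


def multi_information(seqs, z=None):
    """Empirical mutual information of k sequences, optionally conditioned on z:
    sum_i H(x_i | z) - H(x_1, ..., x_k | z), using H(a | z) = H(a, z) - H(z)."""
    if z is None:
        z = [0] * len(seqs[0])  # a constant sequence means no conditioning
    h_z = joint_entropy(z)
    marginals = sum(joint_entropy(x, z) - h_z for x in seqs)
    joint = joint_entropy(*seqs, z) - h_z
    return marginals - joint
```

For $k = 2$ this reduces to the ordinary empirical mutual information, and for three identical fair-bit sequences it evaluates to 2 bits (three marginal entropies of 1 bit minus a joint entropy of 1 bit).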

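Recall that $\mathbf{x}_{\mathcal{A}}$ denotes $\{\mathbf{x}_m, m \in \mathcal{A}\}$ and that the codewords in (3.1) take the form $\mathbf{u}(l, m, \lambda)$. In the
following, we shall use the compact notation $(\mathbf{xu})_{\mathcal{A}} = (\mathbf{x}_{\mathcal{A}}, \mathbf{u}_{\mathcal{A}})$, and

$$\mathbf{u}(l_{\mathcal{A}}, m_{\mathcal{A}}, \lambda) = \{ \mathbf{u}(l_{m_1}, m_1, \lambda), \cdots, \mathbf{u}(l_{m_{|\mathcal{A}|}}, m_{|\mathcal{A}|}, \lambda) \} \quad \text{for } \mathcal{A} = \{m_1, \cdots, m_{|\mathcal{A}|}\}.$$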
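A. M2PMI Criterion

Given $\mathbf{y}, \mathbf{s}^d, \mathbf{w}$, the decoder seeks the coalition size $k$, the conditional host sequence type $\lambda \in \mathscr{P}^{[N]}_{S|S^dW}$
and the codewords $\mathbf{u}(l, m, \lambda)$ in $\mathcal{C}(\mathbf{s}^d, \mathbf{w}, \lambda)$ that maximize the M2PMI criterion below. The column
indices $m \in \hat{\mathcal{K}}$ corresponding to the decoded words form the decoded coalition $\hat{\mathcal{K}}$. If the maximizing $k$
in (5.2) is zero, the receiver outputs $\hat{\mathcal{K}} = \emptyset$.

The Maximum Doubly-Penalized Mutual Information criterion is defined as

$$\max_{k \ge 0} M2PMI(k) \qquad (5.2)$$

where

$$M2PMI(k) = \begin{cases} 0 & \text{if } k = 0 \\ \max_{\lambda \in \mathscr{P}^{[N]}_{S|S^dW}} \ \max_{\mathbf{u}_{\mathcal{K}} \in \mathcal{C}^k(\mathbf{s}^d, \mathbf{w}, \lambda)} \left[ \mathring{I}(\mathbf{u}_{\mathcal{K}}; \mathbf{y}|\mathbf{s}^d \mathbf{w}) - k(\rho(\lambda) + R + \Delta) \right] & \text{if } k = 1, 2, \cdots \end{cases} \qquad (5.3)$$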
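
A brute-force toy version of the joint decoding rule may help fix ideas. The sketch below rests on several simplifying assumptions that are not in the paper: each user has a single candidate codeword, a single conditional type with a given penalty `rho` is considered, the side information is passed as one sequence `z`, and the score of a size-$k$ coalition is taken to be the empirical mutual information among its $k$ codewords and $\mathbf{y}$ given `z`, minus $k(\rho + R + \Delta)$. All function names are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import log2


def joint_entropy(*seqs):
    """Empirical joint entropy (bits) of equal-length sequences."""
    n = len(seqs[0])
    return -sum(c / n * log2(c / n) for c in Counter(zip(*seqs)).values())


def multi_information(seqs, z):
    """sum_i H(x_i | z) - H(x_1, ..., x_k | z) for the empirical joint type."""
    h_z = joint_entropy(z)
    return (sum(joint_entropy(x, z) - h_z for x in seqs)
            - (joint_entropy(*seqs, z) - h_z))


def m2pmi_decode(codewords, y, z, rate, delta, rho, max_size=None):
    """Return (coalition, score) maximizing the penalized score.

    codewords -- dict: user index -> that user's single candidate codeword
    The empty coalition scores 0, so an accusation needs a positive score.
    """
    users = sorted(codewords)
    limit = len(users) if max_size is None else min(max_size, len(users))
    best, best_score = frozenset(), 0.0
    for k in range(1, limit + 1):
        for coalition in combinations(users, k):
            seqs = [codewords[m] for m in coalition] + [y]
            score = multi_information(seqs, z) - k * (rho + rate + delta)
            if score > best_score:
                best, best_score = frozenset(coalition), score
    return best, best_score
```

Because the penalty grows linearly with coalition size, adding a user to the candidate coalition only pays off if that user's codeword contributes more than $\rho + R + \Delta$ bits of additional empirical dependence.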



B. Properties 

The following lemma shows that 1) each subset of the estimated coalition is significant, and 2) any 
further extension of the coalition would fail a significance test. The proof parallels that of Lemma 5.1 
in [10] and is therefore omitted. 

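Lemma 5.1: Let $\hat{\mathcal{K}}, \lambda, l_{\hat{\mathcal{K}}}$ achieve the maximum in (5.2), (5.3), i.e., $\mathbf{u}_{\hat{\mathcal{K}}} = \mathbf{u}(l_{\hat{\mathcal{K}}}, m_{\hat{\mathcal{K}}}, \lambda)$. Then for each
subset of the estimated coalition $\hat{\mathcal{K}}$, we have

$$\forall \mathcal{A} \subseteq \hat{\mathcal{K}} : \ \mathring{I}(\mathbf{u}(l_{\mathcal{A}}, m_{\mathcal{A}}, \lambda); \mathbf{y}\, \mathbf{u}(l_{\hat{\mathcal{K}} \setminus \mathcal{A}}, m_{\hat{\mathcal{K}} \setminus \mathcal{A}}, \lambda)|\mathbf{s}^d \mathbf{w}) > |\mathcal{A}|(\rho(\lambda) + R + \Delta). \qquad (5.4)$$

Moreover, for every $\mathcal{A}$ disjoint with $\hat{\mathcal{K}}$,

$$\mathring{I}(\mathbf{u}(l_{\mathcal{A}}, m_{\mathcal{A}}, \lambda); \mathbf{y}\, \mathbf{u}(l_{\hat{\mathcal{K}}}, m_{\hat{\mathcal{K}}}, \lambda)|\mathbf{s}^d \mathbf{w}) \le |\mathcal{A}|(\rho(\lambda) + R + \Delta). \qquad (5.5)$$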



C. Error Exponents 

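Define for each $\mathcal{A} \subseteq \mathcal{K}$ the set of conditional p.m.f.'s

$$\mathscr{P}_{Y(XU)_{\mathcal{K}}|SW}(p_W, p_{S|W}, p_{XU|SW}, \mathscr{W}_K, R, L_w, L_u, \mathcal{A}) = \Big\{ p_{Y(XU)_{\mathcal{K}}|SW} : \ p_{(XU)_{\mathcal{K}}|SW} \in \mathscr{M}(p_{XU|SW}),$$
$$\frac{1}{|\mathcal{A}|} \mathring{I}_{p_W p_{S|W} p_{Y(XU)_{\mathcal{K}}|SW}}(U_{\mathcal{A}}; Y U_{\mathcal{K} \setminus \mathcal{A}}|S^d W) \le I_{p_W p_{S|W} p_{XU|SW}}(U; S|S^d W) + R \Big\} \qquad (5.6)$$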
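and the pseudo sphere packing exponent

$$E_{psp,\mathcal{A}}(R, L_w, L_u, p_W, p_{S|W}, p_{XU|SW}, \mathscr{W}_K) = \min_{p_{Y(XU)_{\mathcal{K}}|SW} \in \mathscr{P}_{Y(XU)_{\mathcal{K}}|SW}(p_W, p_{S|W}, p_{XU|SW}, \mathscr{W}_K, R, L_w, L_u, \mathcal{A})} D(p_{Y(XU)_{\mathcal{K}}|SW}\, p_{S|W} \,\|\, p_{Y|X_{\mathcal{K}}}\, p_{(XU)_{\mathcal{K}}|SW}\, p_S \,|\, p_W). \qquad (5.7)$$

Taking the maximum* and the minimum of $E_{psp,\mathcal{A}}$ above over all subsets $\mathcal{A}$ of $\mathcal{K}$, we define

$$\overline{E}_{psp}(R, L_w, L_u, p_W, p_{S|W}, p_{XU|SW}, \mathscr{W}_K) = E_{psp,\mathcal{K}}(R, L_w, L_u, p_W, p_{S|W}, p_{XU|SW}, \mathscr{W}_K), \qquad (5.8)$$

$$\underline{E}_{psp}(R, L_w, L_u, p_W, p_{S|W}, p_{XU|SW}, \mathscr{W}_K) = \min_{\mathcal{A} \subseteq \mathcal{K}} E_{psp,\mathcal{A}}(R, L_w, L_u, p_W, p_{S|W}, p_{XU|SW}, \mathscr{W}_K). \qquad (5.9)$$

Now define

$$E_{psp}(R, L_w, L_u, D_1, \mathscr{W}_K) = \max_{p_W \in \mathscr{P}_W} \ \min_{p_{S|W} \in \mathscr{P}_{S|W}} \ \max_{p_{XU|SW} \in \mathscr{P}_{XU|SW}(p_W p_{S|W}, L_w, L_u, D_1)} E_{psp,\mathcal{K}}(R, L_w, L_u, p_W, p_{S|W}, p_{XU|SW}, \mathscr{W}_K^{fair}). \qquad (5.10)$$

Denote by $p_W^*$ and $p^*_{XU|SW}$ the maximizers in (5.10), where the latter is to be viewed as a function of
$p_{S|W}$. Both $p_W^*$ and $p^*_{XU|SW}$ implicitly depend on $R$ and $\mathscr{W}_K^{fair}$. Finally, define

$$\overline{E}_{psp}(R, L_w, L_u, D_1, \mathscr{W}_K) = \min_{p_{S|W} \in \mathscr{P}_{S|W}} \overline{E}_{psp}(R, L_w, L_u, p_W^*, p_{S|W}, p^*_{XU|SW}, \mathscr{W}_K), \qquad (5.11)$$

$$\underline{E}_{psp}(R, L_w, L_u, D_1, \mathscr{W}_K) = \min_{p_{S|W} \in \mathscr{P}_{S|W}} \underline{E}_{psp}(R, L_w, L_u, p_W^*, p_{S|W}, p^*_{XU|SW}, \mathscr{W}_K). \qquad (5.12)$$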

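Theorem 5.2: The decision rule (5.2) yields the following error exponents.

(i) The false-positive error exponent is

$$E_{FP}(R, D_1, \mathscr{W}_K, \Delta) = \Delta. \qquad (5.13)$$

(ii) The detect-all error exponent is

$$E^{all}(R, L_w, L_u, D_1, \mathscr{W}_K, \Delta) = \underline{E}_{psp}(R + \Delta, L_w, L_u, D_1, \mathscr{W}_K). \qquad (5.14)$$

(iii) The detect-one error exponent is

$$E^{one}(R, L_w, L_u, D_1, \mathscr{W}_K, \Delta) = \overline{E}_{psp}(R + \Delta, L_w, L_u, D_1, \mathscr{W}_K). \qquad (5.15)$$

(iv) $E^{one}(R, L_w, L_u, D_1, \mathscr{W}_K, \Delta) = E^{one}(R, L_w, L_u, D_1, \mathscr{W}_K^{fair}, \Delta)$.
(v) $E^{one}(R, L_w, L_u, D_1, \mathscr{W}_K^{fair}, \Delta) = E^{all}(R, L_w, L_u, D_1, \mathscr{W}_K^{fair}, \Delta)$.

* The property that $\mathcal{K}$ achieves $\max_{\mathcal{A} \subseteq \mathcal{K}} E_{psp,\mathcal{A}}$ is established in the proof of Theorem 5.2, Part (iv).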
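(vi) If $K = K_{nom}$, the supremum of all rates for which the error exponents of (5.15) and (5.14) are
positive is given by

$$C^{one}(D_1, \mathscr{W}_K) = \lim_{L_w, L_u \to \infty} \ \max_{p_W \in \mathscr{P}_W} \ \max_{p_{XU|SW} \in \mathscr{P}_{XU|SW}(p_W p_S, L_w, L_u, D_1)} \ \min_{p_{Y|X_{\mathcal{K}}} \in \mathscr{W}_K^{fair}} \left[ \frac{1}{K} I(U_{\mathcal{K}}; Y|S^d, W) - I(U; S|S^d, W) \right] \qquad (5.16)$$

under the "detect-one" criterion, and by

$$C^{all}(D_1, \mathscr{W}_K) = \lim_{L_w, L_u \to \infty} \ \max_{p_W \in \mathscr{P}_W} \ \max_{p_{(XU)_{\mathcal{K}}|SW} \in \mathscr{P}_{(XU)_{\mathcal{K}}|SW}(p_W, p_S, L_w, L_u, D_1)} \ \min_{p_{Y|X_{\mathcal{K}}} \in \mathscr{W}_K} \ \min_{\mathcal{A} \subseteq \mathcal{K}} \left[ \frac{1}{|\mathcal{A}|} I(U_{\mathcal{A}}; Y|S^d, W, U_{\mathcal{K} \setminus \mathcal{A}}) - I(U; S|S^d, W) \right] \qquad (5.17)$$

under the "detect-all" criterion. If the colluders select a fair collusion channel, as is their
collective interest, the minimization is restricted to $\mathscr{W}_K^{fair}$ in (5.17), and then (5.17) coincides with (5.16).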



For the special case of private fingerprinting ($S^d = S$), the term $I(U; S|S^d, W)$ in (5.16) is zero.
Since $I(U_{\mathcal{K}}; Y|S, W) \le I((XU)_{\mathcal{K}}; Y|S, W)$, it suffices to choose $L_u = |\mathcal{X}|$ and $U = X$ to achieve the
maximum in (5.16). The resulting expression coincides with the capacity formula in [10, Theorem 3.2].
Similarly to the single-user case [11], when $U = X$ the binning scheme is degenerate.

D. Bounded Coalition Size 

Assume now that $K$ is known not to exceed some maximum value $K_{max}$. The same random coding
scheme can be used. In the evaluation of the M2PMI criterion of (5.2), the maximization is now limited
to $0 \le k \le K_{max}$. In Lemma 5.1, property (5.4) holds, and property (5.5) now holds for every $\mathcal{A}$ disjoint
with $\hat{\mathcal{K}}$, and of size $|\mathcal{A}| \le K_{max} - |\hat{\mathcal{K}}|$. Following the derivation of the error exponents in the appendix,
we see that these exponents remain the same as those given by Theorem 5.2.

Blind watermarking. The case $K_{max} = 1$ represents blind watermark decoding with a guarantee that
the false-positive exponent is at least equal to $\Delta$. In this scenario, there is no need for a time-sharing
sequence $\mathbf{w}$, and the decoder's input $\mathbf{y}$ is either an unwatermarked sequence ($K = 0$) or a watermarked
sequence ($K = 1$). The M2PMI criterion of (5.3) reduces to

$$M2PMI(k) = \max_{\lambda} \ \max_{\mathbf{u} \in \mathcal{C}(\mathbf{s}^d, \lambda)} I(\mathbf{u}; \mathbf{y}|\mathbf{s}^d) - (\rho(\lambda) + R + \Delta) \quad \text{for } k = 1.$$

The resulting false-positive and false-negative exponents are given by $\Delta$ and $E_{psp}(R + \Delta, 0, L_u, D_1, \mathscr{W}_K)$,
respectively.
VI. Upper Bounds on Public Fingerprinting Capacity

Deriving public fingerprinting capacity is a challenge because the capacity region for the Gel'fand-Pinsker
version of the MAC is still unknown; in fact, an outer bound for this region has yet to be
established. Even in the case of a MAC with side information causally available at the transmitter but
not at the receiver, the expressions for the inner and outer capacity regions do not coincide [23]. Likewise,
the expression derived below is an upper bound on public fingerprinting capacity under the detect-all
criterion.

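Recall the definition of the set $\mathscr P_{(XU)_{\mathcal K} W|S}(p_S, L_w, L_u, D_1)$ in (4.2), where $W$ and $U$ are random variables defined over alphabets $\mathcal W = \{1, 2, \cdots, L_w\}$ and $\mathcal U = \{1, 2, \cdots, L_u\}$, respectively. Here we define the larger set

$$\mathscr P^{o}_{(XU)_{\mathcal K} W|S}(p_S, L_w, L_u, D_1) = \Big\{ p_{(XU)_{\mathcal K} W|S} = p_W \Big(\prod_{k \in \mathcal K} p_{X_k|SW}\Big)\, p_{U_{\mathcal K}|X_{\mathcal K} S W} \ :\ p_{X_1|SW} = \cdots = p_{X_K|SW},\ \text{and } \mathbb E[d(S, X_1)] \le D_1 \Big\} \tag{6.1}$$

where $X_k$, $k \in \mathcal K$, are still conditionally i.i.d. given $(S, W)$ but the random variables $U_k$, $k \in \mathcal K$, are generally conditionally dependent.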
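Define

$$\overline{C}^{\mathrm{all}}_{L_w, L_u}(D_1, \mathscr W_K) = \max_{p_{(XU)_{\mathcal K} W|S} \in \mathscr P^{o}_{(XU)_{\mathcal K} W|S}(p_S, L_w, L_u, D_1)}\ \min_{p_{Y|X_{\mathcal K}} \in \mathscr W_K}\ \min_{\mathcal A \subseteq \mathcal K}\ \frac{1}{|\mathcal A|} \big[ I(U_{\mathcal A}; Y, S^d \mid U_{\mathcal K \setminus \mathcal A}) - I(U_{\mathcal A}; S \mid U_{\mathcal K \setminus \mathcal A}) \big]. \tag{6.2}$$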



Using the same derivation as in Lemma 2.1 of [11], it can be shown that $\overline{C}^{\mathrm{all}}_{L_w, L_u}(D_1, \mathscr W_K)$ is a nondecreasing function of $L_w$ and $L_u$ and converges to a finite limit. Moreover, the gap to the limit may be bounded by a polynomial function of $L_w$ and $L_u$; see [11, Sec. 3.5] for a similar derivation.

Theorem 6.1: Public fingerprinting capacity is upper-bounded by

$$\overline{C}^{\mathrm{all}}(D_1, \mathscr W_K) = \lim_{L_w, L_u \to \infty} \overline{C}^{\mathrm{all}}_{L_w, L_u}(D_1, \mathscr W_K) \tag{6.3}$$

under the "detect-all" criterion.
Proof: see appendix.

We conjecture that the upper bound on capacity given by Theorem 6.1 is generally not tight. The insight here is that the upper bound remains valid if the class of encoding functions is enlarged to include feedback from the receiver: $X_{ki} = f_i(S, M_k, Y^{i-1})$ for $1 \le i \le N$. It can indeed be verified that all the inequalities in the proof and the Markov chain properties hold. The question is now whether feedback can increase public fingerprinting capacity. We conjecture the answer is yes, because feedback is known to increase MAC capacity [24].

We also make the stronger conjecture that the maximum over $p_{(XU)_{\mathcal K}|SW}$ is achieved by a p.m.f. that decouples the components $(X_k, U_k)$, $k \in \mathcal K$, conditioned on $(S, W)$. If this is true, the set $\mathscr P^{o}_{(XU)_{\mathcal K} W|S}(p_S, L_w, L_u, D_1)$ in the formula (6.2) can be replaced with the smaller set $\mathscr P_{(XU)_{\mathcal K} W|S}(p_S, L_w, L_u, D_1)$ of (4.2), and the random coding scheme of Sec. V is capacity-achieving.

VII. Conclusion 

We have proposed a communication model and a random-coding scheme for blind fingerprinting. While 
a standard binning scheme for communication with asymmetric side information at the transmitter and 
the receiver may seem like a reasonable candidate, such a scheme would be unable to trade false-positive 
error exponents against false-negative error exponents. Our proposed binning scheme combines two ideas. 
The first is the use of a stacked binning scheme as in [11], which demonstrated the advantages (in terms 
of decoding error exponents) of selecting codewords from an array whose size depends on the conditional 
type of the host sequence. The second is the use of an auxiliary time-sharing random variable as in [10]. 
The blind fingerprint decoders of Secs. IV and V combine the advantages of both methods and provide positive error exponents for a range of code rates. The tradeoff between the two fundamental types of error probabilities is determined by the value of the parameter $\Delta$.

Appendix I
Proof of Theorem 4.1

We derive the error exponents for the thresholding rule (4.1). We have $\mathcal W = \{1, 2, \cdots, L_w\}$ and $\mathcal U = \{1, 2, \cdots, L_u\}$. Fix some arbitrarily small $\epsilon > 0$. Define for all $m \in \mathcal K$



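$$\tilde E_{\mathrm{psp},m,N}(R, L_w, L_u, p_w, p_{s|w}, p_{xu|sw}, \mathscr W_K) = \min_{p_{y(xu)_{\mathcal K}|sw} \in \mathscr P^{[N]}_{Y(XU)_{\mathcal K}|SW}(p_w, p_{s|w}, p_{xu|sw}, \mathscr W_K, R, L_w, L_u, m)} D\big(p_{y(xu)_{\mathcal K}|sw} \,\big\|\, p_{y|x_{\mathcal K}}\, p^K_{xu|sw} \,\big|\, p_{sw}\big), \tag{A.1}$$

$$E_{\mathrm{psp},m,N}(R, L_w, L_u, p_w, p_{s|w}, p_{xu|sw}, \mathscr W_K) = D(p_{s|w} \| p_S \mid p_w) + \tilde E_{\mathrm{psp},m,N}(R, L_w, L_u, p_w, p_{s|w}, p_{xu|sw}, \mathscr W_K)$$
$$= \min_{p_{y(xu)_{\mathcal K}|sw} \in \mathscr P^{[N]}_{Y(XU)_{\mathcal K}|SW}(p_w, p_{s|w}, p_{xu|sw}, \mathscr W_K, R, L_w, L_u, m)} D\big(p_{y(xu)_{\mathcal K}|sw}\, p_{s|w} \,\big\|\, p_{y|x_{\mathcal K}}\, p^K_{xu|sw}\, p_S \,\big|\, p_w\big), \tag{A.2}$$

$$\overline E_{\mathrm{psp},N}(R, L_w, L_u, p_w, p_{s|w}, p_{xu|sw}, \mathscr W_K) = \max_{m \in \mathcal K} E_{\mathrm{psp},m,N}(R, L_w, L_u, p_w, p_{s|w}, p_{xu|sw}, \mathscr W_K), \tag{A.3}$$

$$\underline E_{\mathrm{psp},N}(R, L_w, L_u, p_w, p_{s|w}, p_{xu|sw}, \mathscr W_K) = \min_{m \in \mathcal K} E_{\mathrm{psp},m,N}(R, L_w, L_u, p_w, p_{s|w}, p_{xu|sw}, \mathscr W_K), \tag{A.4}$$

where (A.2) is obtained by application of the chain rule for divergence. Also define

$$E_{\mathrm{psp},N}(R, L_w, L_u, D_1, \mathscr W_K) = \max_{p_w \in \mathscr P^{[N]}_W}\ \min_{p_{s|w} \in \mathscr P^{[N]}_{S|W}}\ \max_{p_{xu|sw} \in \mathscr P^{[N]}_{XU|SW}(p_w, p_{s|w}, L_w, L_u, D_1)} E_{\mathrm{psp},1,N}(R, L_w, L_u, p_w, p_{s|w}, p_{xu|sw}, \mathscr W_K^{\mathrm{fair}}). \tag{A.5}$$

Denote by $p^*_w$ and $p^*_{xu|sw}$ the maximizers above, the latter viewed as a function of $p_{s|w}$. Both maximizers depend implicitly on $R$ and $\mathscr W_K^{\mathrm{fair}}$. Let

$$\overline E_{\mathrm{psp},N}(R, L_w, L_u, D_1, \mathscr W_K) = \min_{p_{s|w} \in \mathscr P^{[N]}_{S|W}} \overline E_{\mathrm{psp},N}(R, L_w, L_u, p^*_w, p_{s|w}, p^*_{xu|sw}, \mathscr W_K), \tag{A.6}$$

$$\underline E_{\mathrm{psp},N}(R, L_w, L_u, D_1, \mathscr W_K) = \min_{p_{s|w} \in \mathscr P^{[N]}_{S|W}} \underline E_{\mathrm{psp},N}(R, L_w, L_u, p^*_w, p_{s|w}, p^*_{xu|sw}, \mathscr W_K). \tag{A.7}$$

The exponents (A.2)-(A.7) differ from (4.4)-(4.9) in that the optimizations are performed over conditional types instead of general conditional p.m.f.'s. We have

$$\lim_{N \to \infty} \overline E_{\mathrm{psp},N}(R, L_w, L_u, D_1, \mathscr W_K) = \overline E_{\mathrm{psp}}(R, L_w, L_u, D_1, \mathscr W_K), \tag{A.8}$$
$$\lim_{N \to \infty} \underline E_{\mathrm{psp},N}(R, L_w, L_u, D_1, \mathscr W_K) = \underline E_{\mathrm{psp}}(R, L_w, L_u, D_1, \mathscr W_K) \tag{A.9}$$

by continuity of the divergence and mutual-information functionals.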

Consider the maximization over the conditional type $p_{xu|sw}$ in (A.5). As a result of this maximization, we may associate the following:

• to any $(s, w)$, a conditional type class $T^*_{u|sw}(s, w) \triangleq T^*_{u|sw}$;

• to any $(s^d, w)$, a conditional type class $T^*_{u|s^d w}(s^d, w) \triangleq T^*_{u|s^d w}$;

• to any $(s, w)$ and $u \in T^*_{u|sw}(s, w)$, a conditional type class $T^*_{x|usw}(u, s, w) \triangleq T^*_{x|usw}$;

• to any type $p_{sw}$, a conditional mutual information $I^*_{us|s^d w}(p_{sw}) \triangleq I(u; s \mid s^d, w)$ where $u, s, w$ are any three sequences with joint type $p^*_{u|sw}\, p_{sw}$.

Codebook. Define the function

$$\rho(p_{s|s^d w}) \triangleq I^*_{us|s^d w}(p_{sw}) + \epsilon.$$

A random constant-composition code

$$\mathcal C(s^d, w, p_{s|s^d w}) = \{u(l, m, p_{s|s^d w}),\ 1 \le l \le \exp_2\{N \rho(p_{s|s^d w})\},\ 1 \le m \le 2^{NR}\}$$

is generated for each $s^d \in (\mathcal S^d)^N$, $w \in T^*_w$, and $p_{s|s^d w} \in \mathscr P^{[N]}_{S|S^d W}$ by drawing $\exp_2\{N(R + \rho(p_{s|s^d w}))\}$ random sequences independently and uniformly from the conditional type class $T^*_{u|s^d w}(s^d, w)$, and arranging them into an array with $2^{NR}$ columns and $\exp_2\{N \rho(p_{s|s^d w})\}$ rows.

Encoder. Prior to encoding, a sequence $W \in \mathcal W^N$ is drawn independently of $S$ and uniformly from $T^*_w$, and shared with the receiver. Given $(S, W)$, the encoder determines the conditional type $p_{s|s^d w}$ and performs the following two steps for each user $1 \le m \le 2^{NR}$.

1) Find $l$ such that $u(l, m, p_{s|s^d w}) \in \mathcal C(S^d, W, p_{s|s^d w}) \cap T^*_{u|sw}(s, w)$. If more than one such $l$ exists, pick one of them randomly (with uniform distribution). Let $u = u(l, m, p_{s|s^d w})$. If no such $l$ can be found, generate $u$ uniformly from the conditional type class $T^*_{u|sw}(s, w)$.

2) Generate $x_m$ uniformly distributed over the conditional type class $T^*_{x|usw}(u, s, w)$.
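The array construction and Step 1 of the encoder can be illustrated on toy alphabets. The following Python sketch is only an illustration under stated assumptions: the helper names (`joint_type`, `draw_conditional`, `build_column`, `encode`) are hypothetical and do not appear in the paper, sequences are short tuples, and a uniform draw from a conditional type class is implemented by shuffling a template sequence within each group of positions that share the same conditioning symbol.

```python
import random
from collections import Counter

def joint_type(*seqs):
    """Empirical joint type (as symbol-tuple counts) of equal-length sequences."""
    return Counter(zip(*seqs))

def draw_conditional(template, cond, rng):
    """Draw uniformly from the conditional type class of `template` given `cond`
    by shuffling `template` within each set of positions sharing a `cond` symbol."""
    out = list(template)
    groups = {}
    for i, c in enumerate(cond):
        groups.setdefault(c, []).append(i)
    for idx in groups.values():
        vals = [template[i] for i in idx]
        rng.shuffle(vals)
        for i, v in zip(idx, vals):
            out[i] = v
    return tuple(out)

def build_column(template, cond, rows, rng):
    """One column of the array (one user m): `rows` independent draws."""
    return [draw_conditional(template, cond, rng) for _ in range(rows)]

def encode(column, s, cond, target, rng):
    """Step 1 of the encoder: pick uniformly among the rows whose joint type
    with (s, cond) equals `target`; return None if no row qualifies."""
    hits = [l for l, u in enumerate(column) if joint_type(u, s, cond) == target]
    return rng.choice(hits) if hits else None
```

In this sketch `cond` plays the role of the pair $(s^d, w)$ and `target` the role of the joint type selected by the optimization. When `encode` returns `None`, the fallback of Step 1 applies, which could be emulated by calling `draw_conditional` with the pair $(s, w)$ as conditioning sequence.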
Collusion channel. By Prop. 3.1, it is sufficient to restrict our attention to strongly exchangeable collusion channels in the error probability analysis.

Decoder. Given $(y, s^d, w)$, the decoder outputs $\hat{\mathcal K}$ if and only if (4.1) is satisfied.

Encoding errors. Analogously to [11], the probability of encoding errors vanishes doubly exponentially with $N$ because $\rho(p_{s|s^d w}) > I(u; s|s^d w)$. Indeed an encoding error for user $m$ arises under the following event:

$$\mathcal E_m = \big\{(\mathcal C, s, w) : \big(u(l, m, p_{s|s^d w}) \in \mathcal C \text{ and } u(l, m, p_{s|s^d w}) \notin T^*_{u|sw}(s, w)\big) \text{ for } 1 \le l \le 2^{N \rho(p_{s|s^d w})}\big\}. \tag{A.10}$$

The probability that a sequence $U$ uniformly distributed over $T^*_{u|s^d w}(s^d, w)$ also belongs to $T^*_{u|sw}(s, w)$ is equal to $\exp_2\{-N I^*_{us|s^d w}(p_{sw})\}$ on the exponential scale. Therefore the encoding error probability, conditioned on type class $T_{sw}$, satisfies

$$\Pr[\mathcal E_m \mid T_{sw}] \doteq \big(1 - 2^{-N I^*_{us|s^d w}(p_{sw})}\big)^{2^{N \rho(p_{s|s^d w})}} \le \exp\big\{-\exp_2\big(N[\rho(p_{s|s^d w}) - I^*_{us|s^d w}(p_{sw})]\big)\big\} = \exp\{-2^{N\epsilon}\} \tag{A.11}$$

where the inequality follows from $1 - a \le e^{-a}$.
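The doubly exponential decay in (A.11) is easy to check numerically. The short Python sketch below is not part of the proof; it assumes the relation $\rho = I^* + \epsilon$ that makes the last equality of (A.11) hold, and the function name `encoding_error_bound` and its arguments are hypothetical. It evaluates $(1-a)^{2^{N\rho}}$ with $a = 2^{-N I^*}$ in a numerically stable way and compares it with $\exp\{-2^{N\epsilon}\}$.

```python
import math

def encoding_error_bound(n, i_star, eps):
    """Return (exact, bound), where
    exact = (1 - 2^{-n I})^{2^{n rho}} and bound = exp(-2^{n eps}), rho = I + eps."""
    rho = i_star + eps
    a = 2.0 ** (-n * i_star)                   # per-row success probability (exponential scale)
    rows = 2.0 ** (n * rho)                    # number of rows in the array
    exact = math.exp(rows * math.log1p(-a))    # (1 - a)^rows, computed stably
    bound = math.exp(-(2.0 ** (n * eps)))      # right-hand side of (A.11)
    return exact, bound
```

For any fixed $\epsilon > 0$ the bound shrinks doubly exponentially as $N$ grows, which is why encoding errors do not affect the decoding error exponents.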

The derivation of the decoding error exponents is based on the following two asymptotic equalities, which are special cases of (C.2) and (C.5) established in Lemma 3.1.

1) Fix $y, s^d, w$ and draw $u$ uniformly from some fixed type class, independently of $(y, s^d, w)$. Then

$$\Pr[I(u; y|s^d w) \ge \nu] \doteq 2^{-N\nu}. \tag{A.12}$$

2) Given $s, w$, draw $(x_k, u_k)$, $k \in \mathcal K$, i.i.d. uniformly from a conditional type class $T_{xu|sw}$, then draw $y$ uniformly over a single conditional type class $T_{y|x_{\mathcal K}}$. For any $\nu > 0$, we have

$$\Pr[I(u_m; y|s^d w) \le \nu + \rho(p_{s|s^d w})] \doteq \exp_2\big\{-N \tilde E_{\mathrm{psp},m,N}(\nu, L_w, L_u, p_w, p_{s|w}, p_{xu|sw}, \mathscr W_K)\big\}. \tag{A.13}$$
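The quantity $I(u; y|s^d w)$ appearing in these equalities is an empirical conditional mutual information, i.e., a function of the joint type of the sequences involved. As a minimal sketch (the function names `empirical_entropy` and `empirical_cmi` are hypothetical, and sequences are represented as equal-length tuples of symbols), it can be computed from empirical joint entropies as follows.

```python
from collections import Counter
from math import log2

def empirical_entropy(*seqs):
    """Joint empirical entropy (bits) of equal-length sequences."""
    n = len(seqs[0])
    counts = Counter(zip(*seqs))
    return -sum(c / n * log2(c / n) for c in counts.values())

def empirical_cmi(u, y, cond):
    """Empirical I(u; y | cond) = H(u,cond) + H(y,cond) - H(u,y,cond) - H(cond)."""
    return (empirical_entropy(u, cond) + empirical_entropy(y, cond)
            - empirical_entropy(u, y, cond) - empirical_entropy(cond))
```

Here `cond` stands for the sequence of pairs $(s^d_i, w_i)$; a decoder of the thresholding type would compare such a statistic, minus the penalty $\rho(\lambda)$, with $R + \Delta$.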

(i). False Positives. From (4.1), the occurrence of a false positive implies that

$$\exists \lambda, l, m \notin \mathcal K :\ I(u(l, m, \lambda); y|s^d w) > \rho(\lambda) + R + \Delta. \tag{A.14}$$

By construction of the codebook, $u(l, m, \lambda)$ is independent of $y$ for $m \notin \mathcal K$. For any given $\lambda$, there are at most $2^{N\rho(\lambda)}$ possible values for $l$ and $2^{NR} - K$ possible values for $m$ in (A.14). Hence the probability of false positives, conditioned on the joint type class $T_{y(xu)_{\mathcal K} sw}$, is

$$P_{FP}(T_{y(xu)_{\mathcal K} sw}, \mathscr W_K) \le \sum_{\lambda} (2^{NR} - K)\, 2^{N\rho(\lambda)} \Pr[I(u(l, m, \lambda); y|s^d w) > \rho(\lambda) + R + \Delta]$$
$$\stackrel{(a)}{\doteq} \sum_{\lambda} 2^{-N\Delta} \stackrel{(b)}{\le} (N+1)^{|\mathcal S||\mathcal S^d||\mathcal W|}\, 2^{-N\Delta} \doteq 2^{-N\Delta} \tag{A.15}$$

where (a) is obtained by application of (A.12) with $\nu = \rho(\lambda) + R + \Delta$, and (b) because the number of conditional types $\lambda$ is at most $(N+1)^{|\mathcal S||\mathcal S^d||\mathcal W|}$.

Averaging over all type classes $T_{y(xu)_{\mathcal K} sw}$, we obtain $P_{FP} \,\dot\le\, 2^{-N\Delta}$, from which (4.10) follows.
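The step from a sum over types to a single exponential term relies on the fact that the number of types grows only polynomially in $N$. As a small illustration of this standard fact from the method of types (the function name `num_types` is hypothetical, and the example counts unconditional types over a single alphabet rather than the conditional types $\lambda$ of the proof), the exact count can be compared with the bound $(N+1)^{|\mathcal X|}$:

```python
from math import comb, log2

def num_types(n, alphabet_size):
    """Number of types (empirical distributions) of length-n sequences
    over an alphabet with `alphabet_size` symbols (stars-and-bars count)."""
    return comb(n + alphabet_size - 1, alphabet_size - 1)

def type_count_rate(n, alphabet_size):
    """Normalized exponent (1/n) log2 of the number of types; tends to 0."""
    return log2(num_types(n, alphabet_size)) / n
```

Because `type_count_rate` vanishes as $N \to \infty$, the polynomial factor does not change the exponent $\Delta$ in (A.15).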

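(ii). Detect-One Error Criterion (Miss All Colluders). We first derive the error exponent for the event that the decoder misses a specific colluder $m \in \mathcal K$. Any coalition $\hat{\mathcal K}$ that contains $m$ fails the test (4.1), i.e., for any such $\hat{\mathcal K}$,

$$\forall \lambda \in \mathscr P^{[N]}_{S|S^d W} :\ \max_{l} I(u(l, m, \lambda); y|s^d w) \le \rho(\lambda) + R + \Delta. \tag{A.16}$$

This implies that

$$I(u(l, m, p_{s|s^d w}); y|s^d w) \le \rho(p_{s|s^d w}) + R + \Delta \tag{A.17}$$

where $l$ is the row index actually selected by the encoder, and $p_{s|s^d w}$ is the actual host sequence conditional type. The probability of the miss-$m$ event, given the joint type $p^*_w\, p_{s|w}\, p^*_{xu|sw}$, is therefore upper-bounded by the probability of the event (A.17):

$$p_{\mathrm{miss}\text{-}m}(p^*_w, p_{s|w}, p^*_{xu|sw}, \mathscr W_K) \le \Pr\big[I(u(l, m, p_{s|s^d w}); y|s^d w) \le \rho(p_{s|s^d w}) + R + \Delta\big]$$
$$\stackrel{(a)}{\doteq} \exp_2\big\{-N \tilde E_{\mathrm{psp},m,N}(R + \Delta, L_w, L_u, p^*_w, p_{s|w}, p^*_{xu|sw}, \mathscr W_K)\big\}$$

where (a) follows from (A.13) with $\nu = R + \Delta$.

The miss-all event is the intersection of the miss-$m$ events over $m \in \mathcal K$. Its conditional probability is

$$p_{\mathrm{miss\text{-}all}}(p^*_w, p_{s|w}, p^*_{xu|sw}, \mathscr W_K) = \Pr\Big[\bigcap_{m \in \mathcal K} \{\text{miss } m \mid p^*_w, p_{s|w}, p^*_{xu|sw}\}\Big] \le \min_{m \in \mathcal K} p_{\mathrm{miss}\text{-}m}(p^*_w, p_{s|w}, p^*_{xu|sw}, \mathscr W_K)$$
$$\doteq \exp_2\Big\{-N \max_{m \in \mathcal K} \tilde E_{\mathrm{psp},m,N}(R + \Delta, L_w, L_u, p^*_w, p_{s|w}, p^*_{xu|sw}, \mathscr W_K)\Big\}. \tag{A.18}$$

Averaging over $S$, we obtain

$$p_{\mathrm{miss\text{-}all}}(\mathscr W_K) = \sum_{p_{s|w}} \Pr[T_{s|w}]\, p_{\mathrm{miss\text{-}all}}(p^*_w, p_{s|w}, p^*_{xu|sw}, \mathscr W_K) \doteq \max_{p_{s|w}} \Pr[T_{s|w}]\, p_{\mathrm{miss\text{-}all}}(p^*_w, p_{s|w}, p^*_{xu|sw}, \mathscr W_K)$$
$$\stackrel{(a)}{\doteq} \max_{p_{s|w}} \exp_2\Big\{-N\Big[D(p_{s|w} \| p_S \mid p^*_w) + \max_{m \in \mathcal K} \tilde E_{\mathrm{psp},m,N}(R + \Delta, L_w, L_u, p^*_w, p_{s|w}, p^*_{xu|sw}, \mathscr W_K)\Big]\Big\}$$
$$\stackrel{(b)}{=} \exp_2\{-N \overline E_{\mathrm{psp},N}(R + \Delta, L_w, L_u, D_1, \mathscr W_K)\} \stackrel{(c)}{\doteq} \exp_2\{-N \overline E_{\mathrm{psp}}(R + \Delta, L_w, L_u, D_1, \mathscr W_K)\}$$

which establishes (4.12). Here (a) follows from (C.3) and (A.18), (b) from (A.3) and (A.6), and (c) from the limit property (A.8).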
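(iii). Detect-All Error Criterion (Miss Some Colluders).

The miss-some event is the union of the miss-$m$ events over $m \in \mathcal K$. Given the joint type $p^*_w\, p_{s|w}\, p^*_{xu|sw}$, the probability of this event is

$$p_{\mathrm{miss\text{-}some}}(p^*_w, p_{s|w}, p^*_{xu|sw}, \mathscr W_K) = \Pr\Big[\bigcup_{m \in \mathcal K} \{\text{miss } m \mid p^*_w, p_{s|w}, p^*_{xu|sw}\}\Big] \le \sum_{m \in \mathcal K} p_{\mathrm{miss}\text{-}m}(p^*_w, p_{s|w}, p^*_{xu|sw}, \mathscr W_K) \tag{A.19}$$
$$\doteq \max_{m \in \mathcal K} \exp_2\big\{-N \tilde E_{\mathrm{psp},m,N}(R + \Delta, L_w, L_u, p^*_w, p_{s|w}, p^*_{xu|sw}, \mathscr W_K)\big\} = \exp_2\Big\{-N \min_{m \in \mathcal K} \tilde E_{\mathrm{psp},m,N}(R + \Delta, L_w, L_u, p^*_w, p_{s|w}, p^*_{xu|sw}, \mathscr W_K)\Big\}. \tag{A.20}$$

Averaging over $S$, we obtain

$$p_{\mathrm{miss\text{-}some}}(\mathscr W_K) = \sum_{p_{s|w}} \Pr[T_{s|w}]\, p_{\mathrm{miss\text{-}some}}(p^*_w, p_{s|w}, p^*_{xu|sw}, \mathscr W_K)$$
$$\stackrel{(a)}{\doteq} \max_{p_{s|w}} \exp_2\Big\{-N\Big[D(p_{s|w} \| p_S \mid p^*_w) + \min_{m \in \mathcal K} \tilde E_{\mathrm{psp},m,N}(R + \Delta, L_w, L_u, p^*_w, p_{s|w}, p^*_{xu|sw}, \mathscr W_K)\Big]\Big\}$$
$$\stackrel{(b)}{=} \exp_2\{-N \underline E_{\mathrm{psp},N}(R + \Delta, L_w, L_u, D_1, \mathscr W_K)\} \stackrel{(c)}{\doteq} \exp_2\{-N \underline E_{\mathrm{psp}}(R + \Delta, L_w, L_u, D_1, \mathscr W_K)\}$$

which establishes (4.11). Here (a) follows from (C.3) and (A.20), (b) from (A.7) and (A.4), and (c) from the limit property (A.9).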

(iv). Fair Collusion Channels. The proof parallels that of [10, Theorem 4.1(iv)], using the conditional divergence $D(p_{y(xu)_{\mathcal K}|sw}\, p_{s|w} \,\|\, p_{y|x_{\mathcal K}}\, p^K_{xu|sw}\, p_S \mid p_w)$ in place of $D(p_{y x_{\mathcal K}|w} \,\|\, p_{y|x_{\mathcal K}}\, p^K_{x|w} \mid p_w)$.

(v). Immediate, because $\overline E_{\mathrm{psp}} = \underline E_{\mathrm{psp}}$ in this case.

(vi). Positive Error Exponents. From Part (v) above, we may restrict our attention to $\mathscr W_K = \mathscr W_K^{\mathrm{fair}}$. Consider any $\mathcal W = \{1, \cdots, L_w\}$ and $p_w$ that is positive over its support set (if it is not, reduce the value of $L_w$ accordingly). For any $m \in \mathcal K$, the minimand in the expression (4.4) for $E_{\mathrm{psp},m}(R, L_w, L_u, p_w, p_{XU|SW}, \mathscr W_K^{\mathrm{fair}})$ is zero if and only if

$$p_{Y(XU)_{\mathcal K}|SW}\, p_{S|W} = p_{Y|X_{\mathcal K}}\, p^K_{XU|SW}\, p_S, \quad \text{with } p_{Y|X_{\mathcal K}} \in \mathscr W_K^{\mathrm{fair}}.$$

Such $(p_{Y(XU)_{\mathcal K}|SW}, p_{S|W})$ is feasible for (4.3) if and only if $(p_{XU|SW}, p_{Y|X_{\mathcal K}})$ is such that $I(U_m; Y|S^d, W) \le I(U_m; S|S^d, W) + R$. It is not feasible, and thus a positive exponent $E^{\mathrm{one}}_{\mathrm{psp}}$ is guaranteed, if $R < I(U_1; Y|S^d, W) - I(U_1; S|S^d, W)$. The supremum of all such $R$ is given by (4.13) and is achieved by letting $\epsilon \to 0$, $\Delta \to 0$, and $L_w, L_u \to \infty$. □

Appendix II
Proof of Theorem 5.2

We derive the error exponents for the M2PMI decision rule (5.2). Define for all $\mathcal A \subseteq \mathcal K$

I(u^;yu^\^|s'^w) < |^|(p(Ps|s^w) + ^)} (B.l) 
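$$\tilde E_{\mathrm{psp},\mathcal A,N}(R, L_w, L_u, p_w, p_{s|w}, p_{xu|sw}, \mathscr W_K) = \min_{p_{y(xu)_{\mathcal K}|sw} \in \mathscr P^{[N]}_{Y(XU)_{\mathcal K}|SW}(p_w, p_{s|w}, p_{xu|sw}, \mathscr W_K, R, L_w, L_u, \mathcal A)} D\big(p_{y(xu)_{\mathcal K}|sw} \,\big\|\, p_{y|x_{\mathcal K}}\, p^K_{xu|sw} \,\big|\, p_{sw}\big), \tag{B.2}$$

$$E_{\mathrm{psp},\mathcal A,N}(R, L_w, L_u, p_w, p_{s|w}, p_{xu|sw}, \mathscr W_K) = D(p_{s|w} \| p_S \mid p_w) + \tilde E_{\mathrm{psp},\mathcal A,N}(R, L_w, L_u, p_w, p_{s|w}, p_{xu|sw}, \mathscr W_K)$$
$$= \min_{p_{y(xu)_{\mathcal K}|sw} \in \mathscr P^{[N]}_{Y(XU)_{\mathcal K}|SW}(p_w, p_{s|w}, p_{xu|sw}, \mathscr W_K, R, L_w, L_u, \mathcal A)} D\big(p_{y(xu)_{\mathcal K}|sw}\, p_{s|w} \,\big\|\, p_{y|x_{\mathcal K}}\, p^K_{xu|sw}\, p_S \,\big|\, p_w\big), \tag{B.3}$$

$$\overline E_{\mathrm{psp},N}(R, L_w, L_u, p_w, p_{s|w}, p_{xu|sw}, \mathscr W_K) = E_{\mathrm{psp},\mathcal K,N}(R, L_w, L_u, p_w, p_{s|w}, p_{xu|sw}, \mathscr W_K), \tag{B.4}$$

$$\underline E_{\mathrm{psp},N}(R, L_w, L_u, p_w, p_{s|w}, p_{xu|sw}, \mathscr W_K) = \min_{\mathcal A \subseteq \mathcal K} E_{\mathrm{psp},\mathcal A,N}(R, L_w, L_u, p_w, p_{s|w}, p_{xu|sw}, \mathscr W_K), \tag{B.5}$$

$$E_{\mathrm{psp},N}(R, L_w, L_u, D_1, \mathscr W_K) = \max_{p_w \in \mathscr P^{[N]}_W}\ \min_{p_{s|w} \in \mathscr P^{[N]}_{S|W}}\ \max_{p_{xu|sw} \in \mathscr P^{[N]}_{XU|SW}(p_w, p_{s|w}, L_w, L_u, D_1)} E_{\mathrm{psp},\mathcal K,N}(R, L_w, L_u, p_w, p_{s|w}, p_{xu|sw}, \mathscr W_K^{\mathrm{fair}}). \tag{B.6}$$

Denote by $p^*_w$ and $p^*_{xu|sw}$ the maximizers in (B.6), the latter viewed as a function of $p_{s|w}$. Both maximizers depend implicitly on $R$, $D_1$, and $\mathscr W_K^{\mathrm{fair}}$. Let

$$\overline E_{\mathrm{psp},N}(R, L_w, L_u, D_1, \mathscr W_K) = \min_{p_{s|w} \in \mathscr P^{[N]}_{S|W}} \overline E_{\mathrm{psp},N}(R, L_w, L_u, p^*_w, p_{s|w}, p^*_{xu|sw}, \mathscr W_K), \tag{B.7}$$

$$\underline E_{\mathrm{psp},N}(R, L_w, L_u, D_1, \mathscr W_K) = \min_{p_{s|w} \in \mathscr P^{[N]}_{S|W}} \underline E_{\mathrm{psp},N}(R, L_w, L_u, p^*_w, p_{s|w}, p^*_{xu|sw}, \mathscr W_K). \tag{B.8}$$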

The exponents (B.3)-(B.8) differ from (5.7)-(5.12) in that the optimizations are performed over conditional types instead of general conditional p.m.f.'s. We have

$$\lim_{N \to \infty} \overline E_{\mathrm{psp},N}(R, L_w, L_u, D_1, \mathscr W_K) = \overline E_{\mathrm{psp}}(R, L_w, L_u, D_1, \mathscr W_K), \tag{B.9}$$
$$\lim_{N \to \infty} \underline E_{\mathrm{psp},N}(R, L_w, L_u, D_1, \mathscr W_K) = \underline E_{\mathrm{psp}}(R, L_w, L_u, D_1, \mathscr W_K) \tag{B.10}$$

by continuity of the divergence and mutual-information functionals.

The codebook and encoding procedure are exactly as in the proof of Theorem 4.1, the difference being that $p^*_w$ and $p^*_{xu|sw}$ are solutions to the optimization problem (B.6) instead of (A.5). The decoding rule is the M2PMI rule of (5.2).

To analyze the error probability for this random-coding scheme, it is again sufficient to restrict our attention to strongly exchangeable channels and use the bound (3.2) on the conditional probability of the collusion channel output. We also use Lemma 3.1.

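(i). False Positives. By application of (5.4), a false positive occurs if $\hat{\mathcal K} \setminus \mathcal K \ne \emptyset$ and

$$\forall \mathcal A \subseteq \hat{\mathcal K} :\ \mathring I\big(u(l_{\mathcal A}, m_{\mathcal A}, \lambda);\ y\, u(l_{\hat{\mathcal K} \setminus \mathcal A}, m_{\hat{\mathcal K} \setminus \mathcal A}, \lambda) \,\big|\, s^d w\big) > |\mathcal A| (\rho(\lambda) + R + \Delta). \tag{B.11}$$

The probability of this event is upper-bounded by the probability of the larger event

$$\forall \mathcal A \subseteq \hat{\mathcal K} :\ \exists \lambda, l_{\hat{\mathcal K}} :\ \mathring I\big(u(l_{\mathcal A}, m_{\mathcal A}, \lambda);\ y\, u(l_{\hat{\mathcal K} \setminus \mathcal A}, m_{\hat{\mathcal K} \setminus \mathcal A}, \lambda) \,\big|\, s^d w\big) > |\mathcal A| (\rho(\lambda) + R + \Delta). \tag{B.12}$$

Denote by $p^*_{s|s^d w}$ the conditional type of the host sequence and by $l^*_{\mathcal K}$ the row indices selected by the encoder. To each triple $(\hat{\mathcal K}, \lambda, l_{\hat{\mathcal K}})$, we associate a unique subset $\mathcal B$ of $\hat{\mathcal K} \cap \mathcal K$ defined as follows:

• If $\lambda \ne p^*_{s|s^d w}$, then $\mathcal B = \emptyset$.

• If $\lambda = p^*_{s|s^d w}$, then $\mathcal B$ is the (possibly empty) set of all indices $k \in \hat{\mathcal K} \cap \mathcal K$ such that $l_k = l^*_k$.

Thus $\mathcal B$ is the set of colluder indices $k \in \hat{\mathcal K}$ for which the decoder correctly identifies the conditional host sequence type $p^*_{s|s^d w}$ and the codewords $u(l^*_k, k, p^*_{s|s^d w})$ that were assigned by the encoder. Denoting by $\Omega(\mathcal B)$ the set of pairs $(\lambda, l_{\hat{\mathcal K}})$ associated with $\mathcal B$, we rewrite (B.12) as

$$\forall \mathcal A \subseteq \hat{\mathcal K} :\ \exists \mathcal B \subseteq \hat{\mathcal K} \cap \mathcal K,\ \exists (\lambda, l_{\hat{\mathcal K}}) \in \Omega(\mathcal B) :\ \mathring I\big(u(l_{\mathcal A}, m_{\mathcal A}, \lambda);\ y\, u(l_{\hat{\mathcal K} \setminus \mathcal A}, m_{\hat{\mathcal K} \setminus \mathcal A}, \lambda) \,\big|\, s^d w\big) > |\mathcal A| (\rho(\lambda) + R + \Delta). \tag{B.13}$$

Define the complement set $\mathcal A = \hat{\mathcal K} \setminus \mathcal B$, which is comprised of all incorrectly accused users as well as any colluder $k$ such that $\lambda \ne p^*_{s|s^d w}$ or $l_k \ne l^*_k$. Since $\mathcal B \subseteq \mathcal K$ and there is at least one innocent user in $\hat{\mathcal K}$, the cardinality of $\mathcal A$ is at least equal to 1. By construction of the codebook and definition of $\mathcal A$ and $\mathcal B$, $u(l_{\mathcal A}, m_{\mathcal A}, \lambda)$ is independent of $y$ and $u(l^*_{\mathcal B}, m_{\mathcal B}, p^*_{s|s^d w})$. The probability of the event (B.13) is upper-bounded by the probability of the larger event

$$\exists \mathcal B \subseteq \mathcal K,\ \exists \lambda, l_{\mathcal A}, m_{\mathcal A} :\ \mathring I\big(u(l_{\mathcal A}, m_{\mathcal A}, \lambda);\ y\, u(l^*_{\mathcal B}, m_{\mathcal B}, p^*_{s|s^d w}) \,\big|\, s^d w\big) > |\mathcal A| (\rho(\lambda) + R + \Delta). \tag{B.14}$$

Hence the probability of false positives, conditioned on $T_{y(xu)_{\mathcal K} sw}$, satisfies

$$P_{FP}(T_{y(xu)_{\mathcal K} sw}, \mathscr W_K) \le \Pr\Big[\bigcup_{\mathcal B \subseteq \mathcal K}\ \bigcup_{|\mathcal A| \ge 1} \big\{\exists \lambda, l_{\mathcal A}, m_{\mathcal A} :\ \mathring I\big(u(l_{\mathcal A}, m_{\mathcal A}, \lambda);\ y\, u(l^*_{\mathcal B}, m_{\mathcal B}, p^*_{s|s^d w}) \,\big|\, s^d w\big) > |\mathcal A| (\rho(\lambda) + R + \Delta)\big\}\Big]$$
$$\le \sum_{\mathcal B \subseteq \mathcal K}\ \sum_{|\mathcal A| \ge 1} P_{\mathcal B, |\mathcal A|}(T_{y(xu)_{\mathcal K} sw}, \mathscr W_K) \tag{B.15}$$

where

$$P_{\mathcal B, |\mathcal A|}(T_{y(xu)_{\mathcal K} sw}, \mathscr W_K) = \Pr\big[\exists \lambda, l_{\mathcal A}, m_{\mathcal A} :\ \mathring I\big(u(l_{\mathcal A}, m_{\mathcal A}, \lambda);\ y\, u(l^*_{\mathcal B}, m_{\mathcal B}, p^*_{s|s^d w}) \,\big|\, s^d w\big) > |\mathcal A| (\rho(\lambda) + R + \Delta)\big]. \tag{B.16}$$

By definition of $\mathcal B$, there are at most $\sum_{\lambda} 2^{N|\mathcal A| \rho(\lambda)}$ possible values for $l_{\mathcal A}$ and $2^{N|\mathcal A| R}$ possible values for $m_{\mathcal A}$ in (B.16). Hence

$$P_{\mathcal B, |\mathcal A|}(T_{y(xu)_{\mathcal K} sw}, \mathscr W_K) \le \sum_{\lambda} 2^{N|\mathcal A|(R + \rho(\lambda))} \Pr\big[\mathring I\big(u(l_{\mathcal A}, m_{\mathcal A}, \lambda);\ y\, u(l^*_{\mathcal B}, m_{\mathcal B}, p^*_{s|s^d w}) \,\big|\, s^d w\big) > |\mathcal A| (\rho(\lambda) + R + \Delta)\big]$$
$$\stackrel{(a)}{\doteq} \sum_{\lambda} 2^{-N|\mathcal A|\Delta} \le (N+1)^{|\mathcal S||\mathcal S^d||\mathcal W|}\, 2^{-N|\mathcal A|\Delta} \doteq 2^{-N|\mathcal A|\Delta} \tag{B.17}$$

where (a) is obtained by application of (C.2) with $y\, u(l^*_{\mathcal B}, m_{\mathcal B}, p^*_{s|s^d w})$ in place of $z$. Combining (B.15) and (B.17), we obtain

$$P_{FP}(T_{y(xu)_{\mathcal K} sw}, \mathscr W_K) \,\dot\le\, \sum_{\mathcal B \subseteq \mathcal K}\ \sum_{|\mathcal A| \ge 1} 2^{-N|\mathcal A|\Delta} \doteq 2^{-N\Delta}.$$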
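The last asymptotic equality holds because the double sum contains a number of terms that does not grow with $N$, so the term with $|\mathcal A| = 1$ dominates. A small numerical sketch of this claim follows; the function name `fp_union_exponent` is hypothetical, and it is assumed here, for illustration only, that $|\mathcal A|$ ranges over $1, \dots, K$ and that all $2^K$ subsets $\mathcal B$ contribute the same inner sum.

```python
from math import log2

def fp_union_exponent(n, k, delta):
    """-(1/n) log2 of sum over subsets B (2^k of them) and a = 1..k of 2^{-n a delta}."""
    inner = sum(2.0 ** (-n * a * delta) for a in range(1, k + 1))
    total_log2 = k + log2(inner)       # log2 of 2^k * inner
    return -total_log2 / n
```

As $N$ increases, the returned exponent approaches $\Delta$ from below, the gap being of order $K/N$.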

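Averaging over all joint type classes $T_{y(xu)_{\mathcal K} sw}$, we obtain $P_{FP} \,\dot\le\, 2^{-N\Delta}$, from which (5.13) follows.

(ii). Detect-All Criterion (Miss Some Colluders).

Under the miss-some error event, any coalition $\hat{\mathcal K}$ that contains $\mathcal K$ fails the test. By (5.4), this implies

$$\forall \lambda \in \mathscr P^{[N]}_{S|S^d W},\ \exists \mathcal A \subseteq \hat{\mathcal K} :\ \max_{l_{\hat{\mathcal K}}} \mathring I\big(u(l_{\mathcal A}, m_{\mathcal A}, \lambda);\ y\, u(l_{\hat{\mathcal K} \setminus \mathcal A}, m_{\hat{\mathcal K} \setminus \mathcal A}, \lambda) \,\big|\, s^d w\big) \le |\mathcal A| (\rho(\lambda) + R + \Delta). \tag{B.18}$$

In particular, for $\hat{\mathcal K} = \mathcal K$, we have

$$\exists \mathcal A \subseteq \mathcal K :\ \mathring I\big(u(l_{\mathcal A}, m_{\mathcal A}, p_{s|s^d w});\ y\, u(l_{\mathcal K \setminus \mathcal A}, m_{\mathcal K \setminus \mathcal A}, p_{s|s^d w}) \,\big|\, s^d w\big) \le |\mathcal A| (\rho(p_{s|s^d w}) + R + \Delta), \tag{B.19}$$

where $l_{\mathcal K}$ are the row indices actually selected by the encoder, and $p_{s|s^d w}$ is the actual host sequence conditional type. The probability of the miss-some event, conditioned on $(s, w)$, is therefore upper-bounded by the probability of the event (B.19):

$$p_{\mathrm{miss\text{-}some}}(p^*_w, p_{s|w}, p^*_{xu|sw}, \mathscr W_K) \le \Pr\Big[\bigcup_{\mathcal A \subseteq \mathcal K} \big\{\mathring I\big(u(l_{\mathcal A}, m_{\mathcal A}, p_{s|s^d w});\ y\, u(l_{\mathcal K \setminus \mathcal A}, m_{\mathcal K \setminus \mathcal A}, p_{s|s^d w}) \,\big|\, s^d w\big) \le |\mathcal A| (\rho(p_{s|s^d w}) + R + \Delta)\big\}\Big]$$
$$\le \sum_{\mathcal A \subseteq \mathcal K} \Pr\big[\mathring I\big(u(l_{\mathcal A}, m_{\mathcal A}, p_{s|s^d w});\ y\, u(l_{\mathcal K \setminus \mathcal A}, m_{\mathcal K \setminus \mathcal A}, p_{s|s^d w}) \,\big|\, s^d w\big) \le |\mathcal A| (\rho(p_{s|s^d w}) + R + \Delta)\big]$$
$$\stackrel{(a)}{\doteq} \sum_{\mathcal A \subseteq \mathcal K} \exp_2\big\{-N \tilde E_{\mathrm{psp},\mathcal A,N}(R + \Delta, L_w, L_u, p^*_w, p_{s|w}, p^*_{xu|sw}, \mathscr W_K)\big\}$$
$$\doteq \max_{\mathcal A \subseteq \mathcal K} \exp_2\big\{-N \tilde E_{\mathrm{psp},\mathcal A,N}(R + \Delta, L_w, L_u, p^*_w, p_{s|w}, p^*_{xu|sw}, \mathscr W_K)\big\} = \exp_2\Big\{-N \min_{\mathcal A \subseteq \mathcal K} \tilde E_{\mathrm{psp},\mathcal A,N}(R + \Delta, L_w, L_u, p^*_w, p_{s|w}, p^*_{xu|sw}, \mathscr W_K)\Big\} \tag{B.20}$$

where (a) follows from (C.5) with $\nu = R + \Delta$.

Averaging over $S$, we obtain

$$p_{\mathrm{miss\text{-}some}}(\mathscr W_K) = \sum_{p_{s|w}} \Pr[T_{s|w}]\, p_{\mathrm{miss\text{-}some}}(p^*_w, p_{s|w}, p^*_{xu|sw}, \mathscr W_K)$$
$$\stackrel{(a)}{\doteq} \max_{p_{s|w}} \exp_2\Big\{-N\Big[D(p_{s|w} \| p_S \mid p^*_w) + \min_{\mathcal A \subseteq \mathcal K} \tilde E_{\mathrm{psp},\mathcal A,N}(R + \Delta, L_w, L_u, p^*_w, p_{s|w}, p^*_{xu|sw}, \mathscr W_K)\Big]\Big\}$$
$$\stackrel{(b)}{=} \max_{p_{s|w}} \exp_2\big\{-N \underline E_{\mathrm{psp},N}(R + \Delta, L_w, L_u, p^*_w, p_{s|w}, p^*_{xu|sw}, \mathscr W_K)\big\}$$
$$\stackrel{(c)}{=} \exp_2\{-N \underline E_{\mathrm{psp},N}(R + \Delta, L_w, L_u, D_1, \mathscr W_K)\} \stackrel{(d)}{\doteq} \exp_2\{-N \underline E_{\mathrm{psp}}(R + \Delta, L_w, L_u, D_1, \mathscr W_K)\}$$

which proves (5.14). Here (a) follows from (C.3) and (B.20), (b) from the definitions (B.5) and (B.3), (c) from (B.8), and (d) from the limit property (B.10).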

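(iii). Detect-One Criterion (Miss All Colluders). Either the estimated coalition $\hat{\mathcal K}$ is empty, or it is a set $\mathcal I$ of innocent users (disjoint with $\mathcal K$). Hence $P_e^{\mathrm{one}} \le \Pr[\hat{\mathcal K} = \emptyset] + \Pr[\hat{\mathcal K} = \mathcal I]$. The first probability, conditioned on $(s^d, w)$, is bounded as

$$\Pr[\hat{\mathcal K} = \emptyset] = \Pr[\forall \mathcal K' : \mathrm{M2PMI}(\mathcal K') \le 0] \le \Pr[\mathrm{M2PMI}(\mathcal K) \le 0] \le \Pr\big[\mathring I(u_{\mathcal K}; y|s^d w) \le K(\rho(p_{s|s^d w}) + R + \Delta)\big]$$
$$\stackrel{(a)}{\doteq} \exp_2\big\{-N \tilde E_{\mathrm{psp},\mathcal K,N}(R + \Delta, L_w, L_u, p^*_w, p_{s|w}, p^*_{xu|sw}, \mathscr W_K)\big\} \tag{B.21}$$

where (a) follows from (C.5) with $\nu = R + \Delta$. To bound $\Pr[\hat{\mathcal K} = \mathcal I]$, we use property (5.5) with $\hat{\mathcal K} = \mathcal I$ and $\mathcal A = \mathcal K$, which yields

$$\mathring I(u_{\mathcal K}; y u_{\mathcal I}|s^d w) \le K(\rho(p_{s|s^d w}) + R + \Delta).$$

Since

$$\mathring I(u_{\mathcal K}; y u_{\mathcal I}|s^d w) = \mathring I(u_{\mathcal K}; y|s^d w) + I(u_{\mathcal K}; u_{\mathcal I}|y s^d w) \ge \mathring I(u_{\mathcal K}; y|s^d w),$$

combining the two inequalities above yields

$$\mathring I(u_{\mathcal K}; y|s^d w) \le K(\rho(p_{s|s^d w}) + R + \Delta).$$

The probability of this event is again given by (B.21); we conclude that

$$p_{\mathrm{miss\text{-}all}}(p^*_w, p_{s|w}, p^*_{xu|sw}, \mathscr W_K) \,\dot\le\, \exp_2\big\{-N \tilde E_{\mathrm{psp},\mathcal K,N}(R + \Delta, L_w, L_u, p^*_w, p_{s|w}, p^*_{xu|sw}, \mathscr W_K)\big\}.$$

Averaging over $S$ and proceeding as in Part (ii) above, we obtain

$$p_{\mathrm{miss\text{-}all}}(\mathscr W_K) \le \sum_{p_{s|w}} \Pr[T_{s|w}]\, p_{\mathrm{miss\text{-}all}}(p^*_w, p_{s|w}, p^*_{xu|sw}, \mathscr W_K) \doteq \exp_2\{-N \overline E_{\mathrm{psp}}(R + \Delta, L_w, L_u, D_1, \mathscr W_K)\}$$

which establishes (5.15).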

(iv). Optimal Collusion Channels are Fair. The proof parallels that of [10, Theorem 4.1(iv)] and is omitted.

(v). Detect-All Exponent for Fair Collusion Channels. The proof parallels that of [10, Theorem 4.1(v)] and is omitted.

(vi). Achievable Rates. Consider any $\mathcal W = \{1, \cdots, L_w\}$ and $p_w$ that is positive over its support set (if it is not, reduce the value of $L_w$ accordingly). For any $\mathcal A \subseteq \mathcal K$, the divergence to be minimized in the expression for $E_{\mathrm{psp},\mathcal A}(R, L_w, L_u, p_w, p_{S|W}, p_{XU|SW}, \mathscr W_K)$ is zero if and only if

$$p_{Y(XU)_{\mathcal K}|SW} = p_{Y|X_{\mathcal K}}\, p^K_{XU|SW} \quad \text{and} \quad p_{S|W} = p_S.$$

These p.m.f.'s are feasible for (5.6) if and only if the inequality below holds:

$$\frac{1}{|\mathcal A|} I(U_{\mathcal A}; Y U_{\mathcal K \setminus \mathcal A}|S^d, W) \le I(U; S|S^d, W) + R.$$

They are infeasible, and thus positive error exponents are guaranteed, if

$$R < \min_{\mathcal A \subseteq \mathcal K} \frac{1}{|\mathcal A|} I(U_{\mathcal A}; Y U_{\mathcal K \setminus \mathcal A}|S^d, W) - I(U; S|S^d, W).$$

From Part (iv) above, we may restrict our attention to $\mathscr W_K = \mathscr W_K^{\mathrm{fair}}$ under the detect-one criterion. Since the p.m.f. of $(S, W, (XU)_{\mathcal K}, Y)$ is permutation-invariant, by application of [10, Eqn. (3.3)] with $(U_{\mathcal K}, S^d)$ in place of $(X_{\mathcal K}, S)$, we have

$$\min_{\mathcal A \subseteq \mathcal K} \frac{1}{|\mathcal A|} I(U_{\mathcal A}; Y U_{\mathcal K \setminus \mathcal A}|S^d W) = \frac{1}{K} I(U_{\mathcal K}; Y|S^d W). \tag{B.22}$$

Hence the supremum of all $R$ for which error exponents are positive is given by $C^{\mathrm{one}}(D_1, \mathscr W_K)$ in (5.16) and is obtained by letting $\epsilon \to 0$, $\Delta \to 0$, and $L_w, L_u \to \infty$.

For any $\mathscr W_K$, under the detect-all criterion, the supremum of all $R$ for which error exponents are positive is given by $C^{\mathrm{all}}(D_1, \mathscr W_K)$ in (5.17) and is obtained by letting $\epsilon \to 0$, $\Delta \to 0$, and $L_w, L_u \to \infty$. Since the optimal conditional p.m.f. is not necessarily permutation-invariant, (B.22) does not hold in general. However, if $\mathscr W_K = \mathscr W_K^{\mathrm{fair}}$, (B.22) holds, and the same achievable rate is obtained for the detect-one and detect-all problems. □

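Appendix III

Lemma 3.1: 1) Fix $(s^d, w)$ and $z \in \mathcal Z^N$, and draw $u_{\mathcal K} = \{u_m, m \in \mathcal K\}$ i.i.d. uniformly over a common type class $T_{u|s^d w}$, independently of $z$. We have the asymptotic equalities

$$\Pr[T_{u_{\mathcal K}|z s^d w}] = \frac{|T_{u_{\mathcal K}|z s^d w}|}{|T_{u|s^d w}|^K} \doteq 2^{-N[K H(u|s^d w) - H(u_{\mathcal K}|z s^d w)]} = 2^{-N \mathring I(u_{\mathcal K}; z|s^d w)}, \tag{C.1}$$

$$\Pr[\mathring I(u_{\mathcal K}; z|s^d w) \ge \nu] \doteq 2^{-N\nu}. \tag{C.2}$$

2) Given $w$, draw $s$ i.i.d. $p_S$. We have [21]

$$\Pr[T_{s|w}] \doteq 2^{-N D(p_{s|w} \| p_S \mid p_w)}. \tag{C.3}$$

3) Given $(s, w)$, draw $(x_k, u_k)$, $k \in \mathcal K$, i.i.d. uniformly from a conditional type class $T_{xu|sw}$, and then draw $Y$ uniformly from a single conditional type class $T_{y|x_{\mathcal K}}$. We have

$$\Pr[T_{y(xu)_{\mathcal K}|sw}] = \frac{|T_{y|(xu)_{\mathcal K} sw}|}{|T_{y|x_{\mathcal K}}|} \cdot \frac{|T_{(xu)_{\mathcal K}|sw}|}{|T_{xu|sw}|^K} \doteq \exp_2\big\{-N D\big(p_{y(xu)_{\mathcal K}|sw} \,\big\|\, p_{y|x_{\mathcal K}}\, p^K_{xu|sw} \,\big|\, p_{sw}\big)\big\}. \tag{C.4}$$

For any feasible, strongly exchangeable collusion channel, for any $\mathcal A \subseteq \mathcal K$ and $\nu > 0$, we have

$$\Pr\big[\mathring I(u_{\mathcal A}; y u_{\mathcal K \setminus \mathcal A}|s^d w) \le |\mathcal A|(\nu + \rho(p_{s|s^d w}))\big] \doteq \exp_2\big\{-N \tilde E_{\mathrm{psp},\mathcal A,N}(\nu, L_w, L_u, p_w, p_{s|w}, p_{xu|sw}, \mathscr W_K)\big\}. \tag{C.5}$$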

Proof: The derivation of (IC4l) . (|C3]) . and (Oil parallels that of (D.12), (D.15) and (D.16) in [10]. 
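The asymptotic equality (C.3) can be checked numerically in its simplest instance. The Python sketch below is an illustration only: it assumes a binary host alphabet and a trivial (single-symbol) time-sharing alphabet, so that $T_{s|w}$ reduces to the set of binary sequences with a given number of ones, and the function names are hypothetical. It compares the exact normalized exponent of the type-class probability under an i.i.d. Bernoulli($p$) source with the divergence $D(q \| p)$, where $q$ is the fraction of ones.

```python
from math import lgamma, log, log2

def log2_binom(n, j):
    """log2 of the binomial coefficient C(n, j), via log-gamma."""
    return (lgamma(n + 1) - lgamma(j + 1) - lgamma(n - j + 1)) / log(2)

def type_class_exponent(n, j, p):
    """-(1/n) log2 Pr[T], where T is the set of binary sequences with j ones
    and the sequence is drawn i.i.d. Bernoulli(p)."""
    log2_prob = log2_binom(n, j) + j * log2(p) + (n - j) * log2(1 - p)
    return -log2_prob / n

def kl_bernoulli(q, p):
    """D(q || p) in bits for Bernoulli distributions, 0 < p < 1."""
    total = 0.0
    for a, b in ((q, p), (1 - q, 1 - p)):
        if a > 0:
            total += a * log2(a / b)
    return total
```

The exact exponent always lies between $D(q\|p)$ and $D(q\|p) + 2 \log_2(N+1)/N$, so the two quantities agree in the limit of large $N$.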

Appendix IV
Proof of Theorem 6.1

Let $K$ be the size of the coalition and $(f_N, g_N)$ a sequence of length-$N$, rate-$R$ randomized codes. We show that for any sequence of such codes, reliable decoding of all $K$ fingerprints is possible only if $R \le \overline{C}^{\mathrm{all}}(D_1, \mathscr W_K)$. Recall that the encoder generates marked copies $x_m = f_N(s, v, m)$ for $1 \le m \le 2^{NR}$ and that the decoder outputs an estimated coalition $g_N(y, s^d, v) \in \{1, \cdots, 2^{NR}\}^*$. We use the notation $M_{\mathcal K} \triangleq \{M_1, \cdots, M_K\}$ and $X^K \triangleq \{X_1, \cdots, X_K\}$.

To prove that $\overline{C}^{\mathrm{all}}(D_1, \mathscr W_K)$ is an upper bound on capacity, it suffices to identify a family of collusion channels for which reliable decoding is impossible at rates above $\overline{C}^{\mathrm{all}}(D_1, \mathscr W_K)$. As shown in [10], it is sufficient to derive such a bound for the compound family $\mathscr W_K$ of memoryless channels.

Our derivation is an extension of the single-user compound Gel'fand-Pinsker problem [11] to the multiple-access case. A lower bound on error probability is obtained when an oracle informs the decoder that the coalition size is at most $K$. There are $\binom{2^{NR}}{K} \le 2^{KNR}$ possible coalitions of size $\le K$. We represent such a coalition as $M_{\mathcal K} = \{M_1, \cdots, M_K\}$, where $M_k$, $1 \le k \le K$, are drawn i.i.d. uniformly from $\{1, \cdots, 2^{NR}\}$.

Given a memoryless channel $p_{Y|X^K} \in \mathscr W_K$, the joint p.m.f. of $(M_{\mathcal K}, V, S, X^K, Y)$ is given by

$$p_{M_{\mathcal K} V S X^K Y} = p^N_S\, p_V \Big(\prod_{1 \le k \le K} p_{M_k}\, \mathbb 1\{x_k = f_N(s, v, m_k)\}\Big)\, p^N_{Y|X^K}. \tag{D.1}$$

Our derivations make repeated use of the identity

$$I(U_{\mathcal A}; Y|Z, U_{\mathcal K \setminus \mathcal A}) - I(U_{\mathcal A}; S|Z, U_{\mathcal K \setminus \mathcal A}) = I(U_{\mathcal A}; Y, Z|U_{\mathcal K \setminus \mathcal A}) - I(U_{\mathcal A}; S, Z|U_{\mathcal K \setminus \mathcal A})$$

which follows from the chain rule for conditional mutual information and holds for any $(U_{\mathcal K}, S, Y, Z)$.

The total error probability (including false positives and false negatives) for the detect-all decoder is

$$P_e(p_{Y|X^K}) = \Pr[\hat{\mathcal K} \ne \mathcal K] \tag{D.2}$$

when collusion channel $p_{Y|X^K} \in \mathscr W_K$ is in effect.
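The mutual-information identity used repeatedly in this appendix holds for an arbitrary joint distribution, which makes it convenient to sanity-check numerically. The sketch below is illustrative only (the helper names `marginal_entropy`, `cond_mutual_info`, and `random_pmf` are hypothetical); it draws a random joint p.m.f. over five finite variables standing for $(U_{\mathcal A}, U_{\mathcal K \setminus \mathcal A}, S, Y, Z)$ and evaluates both sides from joint entropies.

```python
import itertools
import math
import random

def marginal_entropy(pmf, axes):
    """Entropy (bits) of the marginal of `pmf` on the coordinate indices `axes`."""
    marg = {}
    for outcome, prob in pmf.items():
        key = tuple(outcome[i] for i in axes)
        marg[key] = marg.get(key, 0.0) + prob
    return -sum(p * math.log2(p) for p in marg.values() if p > 0)

def cond_mutual_info(pmf, xs, ys, zs):
    """I(X; Y | Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z); xs, ys, zs are index tuples."""
    h = lambda axes: marginal_entropy(pmf, axes)
    return h(xs + zs) + h(ys + zs) - h(xs + ys + zs) - h(zs)

def random_pmf(shape, seed=0):
    """Random strictly positive joint p.m.f. on a product alphabet of the given shape."""
    rng = random.Random(seed)
    outcomes = list(itertools.product(*[range(n) for n in shape]))
    weights = [rng.random() + 1e-3 for _ in outcomes]
    total = sum(weights)
    return {o: w / total for o, w in zip(outcomes, weights)}

# Coordinates: 0 = U_A, 1 = U_{K\A}, 2 = S, 3 = Y, 4 = Z
UA, UB, S, Y, Z = (0,), (1,), (2,), (3,), (4,)
pmf = random_pmf((2, 2, 2, 3, 2))
lhs = cond_mutual_info(pmf, UA, Y, Z + UB) - cond_mutual_info(pmf, UA, S, Z + UB)
rhs = cond_mutual_info(pmf, UA, Y + Z, UB) - cond_mutual_info(pmf, UA, S + Z, UB)
```

Both sides differ from the conditional quantities by the same term $I(U_{\mathcal A}; Z|U_{\mathcal K \setminus \mathcal A})$, which is why `lhs` and `rhs` agree up to floating-point error.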

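Step 1. Following the derivation of [10, Eqn. (B.20)] with $(Y, S^d, V)$ in place of $(Y, S, V)$ at the receiver, for the error probability $P_e(p_{Y|X^K})$ to vanish for each $p_{Y|X^K} \in \mathscr W_K$, we need

$$R \le \liminf_{N \to \infty}\ \min_{p_{Y|X^K} \in \mathscr W_K}\ \min_{\mathcal A \subseteq \mathcal K}\ \frac{1}{N|\mathcal A|} I(M_{\mathcal A}; Y|S^d, V). \tag{D.3}$$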

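Step 2. Define the i.i.d. random variables

$$W_i = \{V, S_j, j \ne i\} \in \mathcal V_N \times \mathcal S^{N-1}, \quad 1 \le i \le N. \tag{D.4}$$

Also define the random variables

$$V_{k,i} = (M_k, V, S^N_{i+1}), \qquad U_{k,i} = (V_{k,i}, (YS^d)^{i-1}) = (M_k, V, S^N_{i+1}, (YS^d)^{i-1}), \quad 1 \le k \le K,\ 1 \le i \le N \tag{D.5}$$

where $S^N_{i+1} \triangleq (S_{i+1}, \cdots, S_N)$ and $(YS^d)^{i-1} \triangleq (Y_1, S^d_1, \cdots, Y_{i-1}, S^d_{i-1})$. Hence

$$V_{\mathcal K,i-1} = (V_{\mathcal K,i}, S_i), \qquad V_{\mathcal K,1} = U_{\mathcal K,1}, \qquad V_{\mathcal K,N} = (M_{\mathcal K}, V). \tag{D.6}$$

The following properties hold for each $1 \le i \le N$:

• By (D.1) and (D.5), $(S_i, W_i, U^K_i) = (M_{\mathcal K}, V, S, Y^{i-1}) \to X^K_i \to Y_i$ forms a Markov chain.

• The random variables $X_{k,i}$, $1 \le k \le K$, are conditionally i.i.d. given $(S, V) = (S_i, W_i)$.

• Due to the term $Y^{i-1}$ in (D.5), the random variables $U_{k,i}$, $1 \le k \le K$, are conditionally dependent given $(S, V) = (S_i, W_i)$.

The joint p.m.f. of $(S_i, W_i, X^K_i, U^K_i, Y_i)$ may thus be written as

$$p_{S_i}\, p_{W_i} \Big(\prod_{1 \le k \le K} p_{X_{k,i}|S_i W_i}\Big)\, p_{U^K_i|X^K_i S_i W_i}\, p_{Y|X^K}, \quad 1 \le i \le N. \tag{D.7}$$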

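Step 3. Consider a time-sharing random variable $T$ that is uniformly distributed over $\{1, \cdots, N\}$ and independent of the other random variables, and define the tuple of random variables $(S, S^d, W, U^K, X^K, Y)$ as $(S_T, S^d_T, W_T, U^K_T, X^K_T, Y_T)$. Also let $W = (W_T, T)$ and $U_k = (U_{k,T}, T)$, $1 \le k \le K$, which are defined over alphabets of respective cardinalities $L_w(N) = N\, |\mathcal V_N|\, |\mathcal S|^{N-1}$ and $L_u(N)$.

Since $(S_i, W_i, U^K_i) \to X^K_i \to Y_i$ forms a Markov chain, so does $(S, W, U^K) \to X^K \to Y$. From (D.7), the joint p.m.f. of $(S, W, U^K, X^K, Y)$ takes the form

$$p_S\, p_W \Big(\prod_{1 \le k \le K} p_{X_k|SW}\Big)\, p_{U^K|X^K S W}\, p_{Y|X^K}. \tag{D.8}$$

In (6.1) we have defined the set

$$\mathscr P^{o}_{X^K U^K W|S}(p_S, L_w, L_u, D_1) = \Big\{ p_{X^K U^K W|S} = p_W \Big(\prod_{k \in \mathcal K} p_{X_k|SW}\Big)\, p_{U^K|X^K S W} \ :\ p_{X_1|SW} = \cdots = p_{X_K|SW},\ \text{and } \mathbb E\, d(S, X_1) \le D_1 \Big\} \tag{D.9}$$

where $|\mathcal W| = L_w$ and $|\mathcal U| = L_u$. Observe that $p_{X^K U^K W|S}$ defined in (D.8) belongs to $\mathscr P^{o}_{X^K U^K W|S}(p_S, L_w(N), L_u(N), D_1)$.

Define the collection of $K$ indices $\mathcal K = \{1, 2, \cdots, K\}$ and the following functionals indexed by $\mathcal A \subseteq \mathcal K$:

$$J_{L_w, L_u, \mathcal A}(p_S, p_{X^K U^K W|S}, p_{Y|X^K}) = \frac{1}{|\mathcal A|} \big[I(U_{\mathcal A}; Y S^d|U_{\mathcal K \setminus \mathcal A}) - I(U_{\mathcal A}; S|U_{\mathcal K \setminus \mathcal A})\big]. \tag{D.10}$$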
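Step 4. We have

$$I(M_{\mathcal K}; Y|S^d, V) \stackrel{(a)}{=} I(M_{\mathcal K}; Y|S^d, V) - I(M_{\mathcal K}, V; S|S^d)$$
$$= I(M_{\mathcal K}, V; Y|S^d) - I(V; Y|S^d) - I(M_{\mathcal K}, V; S|S^d)$$
$$\le I(M_{\mathcal K}, V; Y|S^d) - I(M_{\mathcal K}, V; S|S^d)$$
$$\stackrel{(b)}{=} I(M_{\mathcal K}, V; Y S^d) - I(M_{\mathcal K}, V; S)$$
$$\stackrel{(c)}{\le} \sum_{i=1}^{N} \big[I(U_{\mathcal K,i}; Y_i S^d_i) - I(U_{\mathcal K,i}; S_i)\big]$$
$$= N \big[I(U_{\mathcal K,T}; Y S^d|T) - I(U_{\mathcal K,T}; S|T)\big]$$
$$= N \big[I(U_{\mathcal K,T}, T; Y S^d) - I(T; Y S^d) - I(U_{\mathcal K,T}, T; S) + I(T; S)\big]$$
$$\stackrel{(d)}{\le} N \big[I(U_{\mathcal K,T}, T; Y S^d) - I(U_{\mathcal K,T}, T; S)\big]$$
$$\stackrel{(e)}{=} N \big[I(U_{\mathcal K}; Y S^d) - I(U_{\mathcal K}; S)\big]$$
$$= N K\, J_{L_w(N), L_u(N), \mathcal K}(p_S, p_{X^K U^K W|S}, p_{Y|X^K}), \tag{D.11}$$

where (a) holds because $M_{\mathcal K}$, $V$, $S$ are mutually independent, (b) follows from the chain rule for mutual information, (c) from [20, Lemma 4], using $V_{\mathcal K,i}$ and $U_{\mathcal K,i}$ in place of $V_i$ and $U_i$, respectively, (d) holds because $I(T; S) = 0$, and (e) by definition of $U_{\mathcal K}$.

For all $\mathcal A \subseteq \mathcal K$, we have

$$I(M_{\mathcal A}; Y|S^d, V) = I(M_{\mathcal A}, V; Y|S^d, V)$$
$$\stackrel{(a)}{=} I(M_{\mathcal A}, V; Y|S^d, V) - I(M_{\mathcal A}, V; S|S^d, M_{\mathcal K \setminus \mathcal A}, V)$$
$$\stackrel{(b)}{\le} I(M_{\mathcal A}, V; Y|S^d, M_{\mathcal K \setminus \mathcal A}, V) - I(M_{\mathcal A}, V; S|S^d, M_{\mathcal K \setminus \mathcal A}, V)$$
$$= I(M_{\mathcal A}, V; Y S^d|M_{\mathcal K \setminus \mathcal A}, V) - I(M_{\mathcal A}, V; S|S^d, M_{\mathcal K \setminus \mathcal A}, V)$$
$$\stackrel{(c)}{=} \sum_{i=1}^{N} \big[I(U_{\mathcal A,i}; Y_i S^d_i|U_{\mathcal K \setminus \mathcal A,i}) - I(U_{\mathcal A,i}; S_i|U_{\mathcal K \setminus \mathcal A,i})\big] \tag{D.12}$$
$$= N \big[I(U_{\mathcal A,T}; Y S^d|U_{\mathcal K \setminus \mathcal A,T}, T) - I(U_{\mathcal A,T}; S|U_{\mathcal K \setminus \mathcal A,T}, T)\big]$$
$$= N \big[I(U_{\mathcal A,T}, T; Y S^d|U_{\mathcal K \setminus \mathcal A,T}, T) - I(U_{\mathcal A,T}, T; S|U_{\mathcal K \setminus \mathcal A,T}, T)\big]$$
$$\stackrel{(d)}{=} N \big[I(U_{\mathcal A}; Y S^d|U_{\mathcal K \setminus \mathcal A}) - I(U_{\mathcal A}; S|U_{\mathcal K \setminus \mathcal A})\big]$$
$$= N |\mathcal A|\, J_{L_w(N), L_u(N), \mathcal A}(p_S, p_{X^K U^K W|S}, p_{Y|X^K}), \tag{D.13}$$

where (a) and (b) hold because $M_{\mathcal K}$, $S$, and $V$ are mutually independent, the equality (c) is proved at the end of this section, and (d) follows from the definition of $U_k$.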
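Combining (D.3), (D.11), and (D.13), we obtain

$$R \le \liminf_{N \to \infty}\ \min_{p_{Y|X^K} \in \mathscr W_K}\ \min_{\mathcal A \subseteq \mathcal K} J_{L_w(N), L_u(N), \mathcal A}(p_S, p_{X^K U^K W|S}, p_{Y|X^K})$$
$$\stackrel{(a)}{\le} \sup_{L_w, L_u}\ \min_{p_{Y|X^K} \in \mathscr W_K}\ \min_{\mathcal A \subseteq \mathcal K} J_{L_w, L_u, \mathcal A}(p_S, p_{X^K U^K W|S}, p_{Y|X^K})$$
$$\le \sup_{L_w, L_u}\ \max_{p_{X^K U^K W|S} \in \mathscr P^{o}_{X^K U^K W|S}(p_S, L_w, L_u, D_1)}\ \min_{p_{Y|X^K} \in \mathscr W_K}\ \min_{\mathcal A \subseteq \mathcal K} J_{L_w, L_u, \mathcal A}(p_S, p_{X^K U^K W|S}, p_{Y|X^K})$$
$$\stackrel{(b)}{=} \sup_{L_w, L_u} \overline{C}^{\mathrm{all}}_{L_w, L_u}(D_1, \mathscr W_K) \stackrel{(c)}{=} \lim_{L_w, L_u \to \infty} \overline{C}^{\mathrm{all}}_{L_w, L_u}(D_1, \mathscr W_K) \tag{D.14}$$

where (a) holds because the functionals $J_{L_w, L_u, \mathcal A}(\cdot)$ are nondecreasing in $L_w, L_u$, (b) uses the definition of $\overline{C}^{\mathrm{all}}_{L_w, L_u}$ in (6.2), and (c) the fact that the sequence $\{\overline{C}^{\mathrm{all}}_{L_w, L_u}\}$ is nondecreasing.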



March 4, 2008 






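Proof of (D.12). Recall the definitions of $V_{\mathcal K,i} = (M_{\mathcal K}, V, S^N_{i+1})$ and $U_{\mathcal K,i} = (V_{\mathcal K,i}, (YS^d)^{i-1})$ in (D.5) and the recursion (D.6) for $V_{\mathcal K,i}$. We prove the following equality:

$$I(U_{\mathcal A,i}; Y_i S^d_i|U_{\mathcal K \setminus \mathcal A,i}) - I(U_{\mathcal A,i}; S_i|U_{\mathcal K \setminus \mathcal A,i}) = \big[I(V_{\mathcal A,i}; (YS^d)^i|V_{\mathcal K \setminus \mathcal A,i}) - I(V_{\mathcal A,i}; S^i|V_{\mathcal K \setminus \mathcal A,i})\big]$$
$$- \big[I(V_{\mathcal A,i-1}; (YS^d)^{i-1}|V_{\mathcal K \setminus \mathcal A,i-1}) - I(V_{\mathcal A,i-1}; S^{i-1}|V_{\mathcal K \setminus \mathcal A,i-1})\big]. \tag{D.15}$$

Then summing both sides of this equality from $i = 2$ to $N$, cancelling terms, and using the properties $V_{\mathcal K,1} = U_{\mathcal K,1}$ and $V_{\mathcal K,N} = (M_{\mathcal K}, V)$ yields (D.12).

The first of the six terms in (D.15) may be expanded as follows:

$$I(U_{\mathcal A,i}; Y_i S^d_i|U_{\mathcal K \setminus \mathcal A,i}) = I(V_{\mathcal A,i}, (YS^d)^{i-1}; Y_i S^d_i|V_{\mathcal K \setminus \mathcal A,i}, (YS^d)^{i-1})$$
$$= I(V_{\mathcal A,i}; Y_i S^d_i|V_{\mathcal K \setminus \mathcal A,i}, (YS^d)^{i-1})$$
$$= I(V_{\mathcal A,i}, (YS^d)^{i-1}; Y_i S^d_i|V_{\mathcal K \setminus \mathcal A,i}) - I((YS^d)^{i-1}; Y_i S^d_i|V_{\mathcal K \setminus \mathcal A,i})$$
$$= I(U_{\mathcal A,i}; Y_i S^d_i|V_{\mathcal K \setminus \mathcal A,i}) - I((YS^d)^{i-1}; Y_i S^d_i|V_{\mathcal K \setminus \mathcal A,i}). \tag{D.16}$$

Similarly for the second term, replacing $(YS^d)_i$ with $S_i$ in the above derivation, we obtain

$$I(U_{\mathcal A,i}; S_i|U_{\mathcal K \setminus \mathcal A,i}) = I(U_{\mathcal A,i}; S_i|V_{\mathcal K \setminus \mathcal A,i}) - I((YS^d)^{i-1}; S_i|V_{\mathcal K \setminus \mathcal A,i}). \tag{D.17}$$

The six terms in (D.15) can be expanded using the chain rule for mutual information, in the same way as in [20, Lemma 4.2]:

$$I(V_{\mathcal A,i}; (YS^d)^i|V_{\mathcal K \setminus \mathcal A,i}) = I(V_{\mathcal A,i}; (YS^d)^{i-1}|V_{\mathcal K \setminus \mathcal A,i}) + I(V_{\mathcal A,i}; (YS^d)_i|(YS^d)^{i-1}, V_{\mathcal K \setminus \mathcal A,i}) \tag{D.18}$$
$$I(V_{\mathcal A,i}; S^i|V_{\mathcal K \setminus \mathcal A,i}) = I(V_{\mathcal A,i}; S^{i-1}|S_i, V_{\mathcal K \setminus \mathcal A,i}) + I(V_{\mathcal A,i}; S_i|V_{\mathcal K \setminus \mathcal A,i}) \tag{D.19}$$
$$I(V_{\mathcal A,i-1}; S^{i-1}|V_{\mathcal K \setminus \mathcal A,i-1}) = I(V_{\mathcal A,i}; S^{i-1}|S_i, V_{\mathcal K \setminus \mathcal A,i}) \tag{D.20}$$
$$I(V_{\mathcal A,i-1}; (YS^d)^{i-1}|V_{\mathcal K \setminus \mathcal A,i-1}) = I(V_{\mathcal A,i}; (YS^d)^{i-1}|S_i, V_{\mathcal K \setminus \mathcal A,i}) \tag{D.21}$$
$$I(U_{\mathcal A,i}; S_i|V_{\mathcal K \setminus \mathcal A,i}) = I((YS^d)^{i-1}; S_i|V_{\mathcal K \setminus \mathcal A,i}) + I(V_{\mathcal A,i}; S_i|(YS^d)^{i-1}, V_{\mathcal K \setminus \mathcal A,i}) \tag{D.22}$$
$$I(U_{\mathcal A,i}; (YS^d)_i|V_{\mathcal K \setminus \mathcal A,i}) = I((YS^d)^{i-1}; (YS^d)_i|V_{\mathcal K \setminus \mathcal A,i}) + I(V_{\mathcal A,i}; (YS^d)_i|(YS^d)^{i-1}, V_{\mathcal K \setminus \mathcal A,i}). \tag{D.23}$$

Moreover, expanding the conditional mutual information $I(V_{\mathcal A,i}; S_i, (YS^d)^{i-1}|V_{\mathcal K \setminus \mathcal A,i})$ in two different ways, we obtain

$$I(V_{\mathcal A,i}; (YS^d)^{i-1}|V_{\mathcal K \setminus \mathcal A,i}) + I(V_{\mathcal A,i}; S_i|(YS^d)^{i-1}, V_{\mathcal K \setminus \mathcal A,i}) = I(V_{\mathcal A,i}; S_i|V_{\mathcal K \setminus \mathcal A,i}) + I(V_{\mathcal A,i}; (YS^d)^{i-1}|S_i, V_{\mathcal K \setminus \mathcal A,i}). \tag{D.24}$$

Subtracting the sum of (D.17), (D.18), (D.20), (D.22), (D.24) from the sum of (D.16), (D.19), (D.21), (D.23), and cancelling terms, we obtain (D.15), from which the claim follows. □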